- This topic has 0 replies, 1 voice, and was last updated 3 years, 4 months ago.
We’re running v6.1, and this is the first time we’ve seen the Monitor Daemon crash on multiple sites. The GUI showed every connection and process in red, while the lock manager stayed green.
Restarting the Monitor Daemon brought it back. Digging deeper, we shut the site down (hcisitectl -K) and brought it back up (hcisitectl -S), and found that some processes had exited abnormally while others were unaffected.
- No log files created in /error dir
- No abnormal errors in Monitor log file
- No abnormal errors or crashes in the process dir
As a preventative measure, we ran siteinit on all our sites to rule out any anomalies, and we are continuing to monitor performance. We still have no root cause, and the failures caught us off guard because any alerts would have been tied to the Monitor Daemon, which wasn’t running.
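One way to close that blind spot is a check that runs outside the Monitor Daemon, for example from cron. The following is only a minimal sketch: the process name `hcimonitord` is an assumption (confirm what the daemon is actually called in your `ps` output), and both function names are made up for illustration.

```python
import subprocess


def running_command_lines():
    """Return the command line of every running process, via POSIX `ps`."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    # The first line of the output is the column header, so skip it.
    return result.stdout.splitlines()[1:]


def daemon_running(daemon_name, command_lines):
    """True if any command line mentions daemon_name.

    daemon_name is an assumption here (e.g. "hcimonitord"); adjust it to
    whatever your installation actually shows.
    """
    return any(daemon_name in line for line in command_lines)
```

A cron job could call `daemon_running("hcimonitord", running_command_lines())` and send mail when it returns `False`. On a server with several sites, the match string would also need to include something site-specific so that one site's daemon does not mask another's.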
My guess is that the lock on the monitor process was corrupted, which held a session open and eventually crashed it. Does that sound plausible?
As an aside, what is the /Lock directory intended for? I understand there is a known issue in this version with session locks not exiting properly. Do the contents of /Lock store these sessions, and if so, is there any harm in routinely deleting files older than 10 days?
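To make the 10-day idea concrete, here is a minimal sketch that only lists candidates and deletes nothing, since it is still an open question whether removing these files is safe. The function name and the directory passed to it are hypothetical, and the 10-day threshold is simply the figure from the question.

```python
import os
import time


def stale_files(directory, max_age_days=10, now=None):
    """Return sorted paths of regular files in `directory` whose last
    modification is older than `max_age_days`. Nothing is deleted."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for entry in os.scandir(directory):
        # Look only at regular files; do not follow symlinks or recurse.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.path)
    return sorted(stale)
```

Reviewing the output of `stale_files(...)` against the full path of the site's Lock directory for a few weeks would show whether old files actually accumulate there before anyone commits to a cleanup job.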