Process alert – intentional vs panic/crash


  • Creator
    Topic
  • #52389
    Lawrence Williams
    Participant

      Does anyone know a way to set up a Process or Engine Status alert that differentiates between a process brought down intentionally (e.g., I bring down a process to implement a change in a tclproc) and a process that crashes or panics?

    Viewing 3 reply threads
    • Author
      Replies
      • #74038
        David Barr
        Participant

          The “engine status” alert won’t tell you this. You could put something in an exec action that searches for the word “panic” in the process log and takes an appropriate action based on what it finds.
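One way the log search suggested above might look, as a rough sketch only. The function name `log_has_panic` and the log path passed to it are hypothetical; where the process log lives depends on your site layout, so the path is left to the caller.

```shell
#!/bin/sh
# Sketch: report whether a process log mentions a panic.
# The function name and the log path argument are hypothetical.

log_has_panic() {
    # -q: print nothing, only set the exit status.
    # -i: match "panic", "Panic", or "PANIC".
    # A missing or unreadable log counts as "no panic found".
    grep -qi 'panic' "$1" 2>/dev/null
}
```

An exec action could then branch on it, for example `if log_has_panic "$log"; then ...; fi`, and page someone only in the panic case.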

        • #74039
          Richard Hart
          Participant

            Hi Lawrence

            Check out the process's ‘exit_log’ file.

            A normal exit will be something like

            Quote:

            Normal exit at Wed Apr  6 09:33:41 2011

            and a Panic exit will have something like

            Quote:

            Abnormal exit – Cloverleaf software panic at Wed Apr  6 09:33:05 2011
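A minimal sketch of how a script might tell the two cases apart. It matches only the wording in the two sample lines quoted above. The function name `classify_exit` is hypothetical, and it assumes the most recent entry is the last non-empty line of the file, which may not hold on your system.

```shell
#!/bin/sh
# Sketch: classify how a process last exited, based on its exit_log.
# Assumption: the most recent entry is the last non-empty line.
# The function name and the path passed in are hypothetical.

classify_exit() {
    exit_log=$1

    # Without a readable file we cannot tell either way.
    [ -r "$exit_log" ] || { echo "unknown"; return 0; }

    # Keep only the most recent non-empty line.
    last_line=$(grep -v '^[[:space:]]*$' "$exit_log" | tail -n 1)

    case $last_line in
        *[Pp]anic*)      echo "panic" ;;
        *"Normal exit"*) echo "normal" ;;
        *)               echo "unknown" ;;
    esac
}
```

An alert script could call `classify_exit` on the process's exit_log and stay quiet when the result is `normal`.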

          • #74040
            Russ Ross
            Participant

              A technique we use that you might want to consider: our alerts call a script that performs the actions we want, such as cycling the thread/process and sending out an email/page.

              In addition to those basic steps, our script also checks whether the alert has been manually toggled off by looking for an applicable whatever.off file in our alerts directory.

              This allows us to selectively toggle off specific alerts during downtime maintenance, such as stopping a process or thread intentionally.

              This does require you to remember to toggle the alert back on when done, so I usually use the at command to remove the whatever.off file some amount of time in the future in case I forget to get rid of it.

              To further guard against forgetting to toggle alerts back on, we also list on the screen during hci login which alerts have been toggled off, by showing all the *.off files in all the site alert directories.
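The toggle-file idea above could be sketched roughly as follows. This is not our actual script. The `ALERT_DIR` variable, the function names, and the `<alert name>.off` naming are all assumptions made for illustration, and the real recovery and notification steps are reduced to a placeholder message.

```shell
#!/bin/sh
# Sketch of the ".off file" toggle. ALERT_DIR, the function names, and
# the "<alert name>.off" naming convention are assumptions.

ALERT_DIR=${ALERT_DIR:-${HCISITEDIR:-.}/Alerts}

# Succeeds (exit status 0) when the named alert has been toggled off.
alert_is_off() {
    [ -e "$ALERT_DIR/$1.off" ]
}

# Print every toggle file in the directory, e.g. from a login script.
list_off_alerts() {
    for f in "$ALERT_DIR"/*.off; do
        # Skip the literal pattern when nothing matches.
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

# Entry point an alert could call with its own name as the argument.
handle_alert() {
    if alert_is_off "$1"; then
        echo "alert $1 is toggled off, skipping"
        return 0
    fi
    # Placeholder for the real work: cycle the process/thread, send mail.
    echo "alert $1 fired: cycle the process/thread and notify here"
}
```

To toggle an alert off and have it come back on by itself, something like `touch "$ALERT_DIR/myalert.off"` followed by `echo "rm -f $ALERT_DIR/myalert.off" | at now + 2 hours` would work, where `myalert` is a made-up alert name.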

              If we are doing extensive maintenance in a site, we turn off alerts for the entire site by loading an off.alrt file instead of the default.alrt file when relaunching the monitor daemon.

              So if $HCISITEDIR/Alerts/off.alrt is an alert file with no alerts in it, then to turn off alerts for the entire site you can run:

                  hcisitectl -k m -s m -A "a=-cl 'off.alrt'"

              When ready to turn alerts back on for the site then you can:

                  hcisitectl -k m -s m

              Even though this does not directly answer your question, it might get you the desired outcome.

              Russ Ross
              RussRoss318@gmail.com

            • #74041
              Michael Hertel
              Participant

                Quote:

                When ready to turn alerts back on for the site then you can:

                   hcisitectl -k m -s m

                Wow, you learn something new every day! 😀

                I didn’t realize you could combine the stop/start in the same command.

                Thanks Russ!
