Alerts failing when engine panics


  • Creator
    Topic
  • #47540
    Jeff Shields
    Participant

      We are currently in the process of testing the use of Alerts to notify us when there are messages in the error database or stuck in the recovery database for a period of time, or when a thread goes into an unexpected state.

      They have been working well, but occasionally a process panics, and when it does, the Alerts pick up neither the fact that the thread is down nor that messages are building up in the recovery database.

      Is this known behaviour? Does anyone know how to work around it? Obviously, we need to be able to monitor when a process goes down…

      TIA,

      Jeff Shields

    Viewing 5 reply threads
    • Author
      Replies
      • #56091
        Anonymous
        Participant

          What alerts do you have set up?

        • #56092
          Jeff Shields
          Participant

            We have alerts of type ‘pstat’, triggered if a thread protocol status is not ‘up’ for 60 seconds, and also of type ‘ipque’ and ‘opque’, triggered if the queue depth is more than zero for 15 minutes.

            These seem to work very well, except if a process panics; all its threads go down but the alert is not triggered. Messages also build up in the recovery database destined for the affected threads, but no alerts are triggered for this either.

          • #56093
            Rick Brown
            Participant

              It would be helpful to know what the panic is.

              Can you let us know if you are getting a dbvista error?

            • #56094
              Ed Mastascusa
              Participant

                If you are specifically looking only for a Panic, here’s a thought for a workaround.

                If you’re on a *ix box you can detect one reasonably well with a script that greps the process’s log file for the string “PANIC: ”. We do this with a korn shell script that runs outside the engine as a cron job. Our script sends an email, and we’re just using standard eo settings. I suppose it’s possible to have settings such that the “PANIC: ” string never gets into the log file.

                When you have a panic the exit_log file should also say something like “Engine Panic at blah blah blah”
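The log check described above can be sketched roughly as follows. This is not the korn shell script from the post; it is a minimal Python illustration of the same idea. The function name `find_panic_lines` and the log path in the usage note are hypothetical, and the marker string is taken from the post (it may differ under non-default eo settings).

```python
# Minimal sketch: scan a process log file for the panic marker.
# Assumption: the engine writes the literal text "PANIC: " to the log,
# as described in the post. Names here are illustrative only.

PANIC_MARKER = "PANIC: "


def find_panic_lines(log_path, marker=PANIC_MARKER):
    """Return a list of (line_number, text) for lines containing marker."""
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, "r", errors="replace") as log_file:
        for number, line in enumerate(log_file, start=1):
            if marker in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

A cron-driven wrapper could call `find_panic_lines("/path/to/process.log")` (a placeholder path) and send an email when the returned list is not empty. A real job would also need to remember which lines it has already reported, so the same panic does not trigger a notification on every run.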

              • #56095
                Michael Hertel
                Participant

                  You should set up an alert to watch for process status.

                  When the process panics, the alert will go off and you’ll know you have bigger issues on your hands.

                  -mh

                • #56096
                  Jeff Shields
                  Participant

                    Thanks for all the responses. I went for the process status alert in the end; it always triggers, even if the process panics.
