Alert based on error DB size?


  • Creator
    Topic
  • #48209
    David Gordon
    Participant

      We just had an issue with a thread receiving roughly a million invalid ACKs over a few hours, which it promptly dumped into the error database. This brought down our production site for about an hour and a half while we tried to clear the messages and bring the processes back online.

      We would like to create an alert that will monitor the error DB, but nothing in the Alert Configurator seems to allow us to do that.  I realize we can monitor error count on a per-thread basis, but that’s going to be a ton of work to do given the size of the site.

      Has anyone else tried doing an alert like this?

    • Author
      Replies
      • #58012
        Mike Grieger
        Participant

          What’s your platform? If UNIX, you can use something like the script I run from cron every so often (every 20 minutes for me). If Windows, you could do something similar by creating a batch file and running it from the scheduler. Having a job check your error count can be very valuable for those times when something is put into production and isn’t working the way it is supposed to.

          Anyway, my script monitors our 4 prod sites, and each site is given its own error threshold. It emails our team if any of those thresholds is reached.

          #!/bin/ksh
          # Written by Mike G
          #     4/2002
          # Purpose: run frequently from cron and alert on anomalies in any production error database
          #
          # To get correct line counts, the dump of the db is grepped for the pattern 0.0., which each engine message ID starts with

          setroot /hci/qdx5.3/integrator

          setsite prodsms
          prodsms_errors=`hcidbdump -e | grep '0.0.' | wc -l`

          setsite prodmck
          prodmck_errors=`hcidbdump -e | grep '0.0.' | wc -l`

          setsite prodtrans
          prodtrans_errors=`hcidbdump -e | grep '0.0.' | wc -l`

          setsite prodepr
          prodepr_errors=`hcidbdump -e | grep '0.0.' | wc -l`

          # Alert thresholds: prodsms at 10 errors or more; prodmck, prodtrans and prodepr at 4 or more
          # Nothing to send while every site is below its threshold
          # (counts are left unquoted so any padding from wc -l is dropped)
          if [ $prodsms_errors -lt 10 ] && [ $prodmck_errors -lt 4 ] && [ $prodtrans_errors -lt 4 ] && [ $prodepr_errors -lt 4 ]
              then exit 1
          fi

          # Build the message body; ksh echo expands \n into extra newlines
          touch /hci/scripts/Database_Error_Alert_File
          echo "The criteria for alerting has been met for one of the following Error Databases:\n\n" >> /hci/scripts/Database_Error_Alert_File
          echo "$prodsms_errors Errors currently in prodsms error db      (alert at 10)\n" >> /hci/scripts/Database_Error_Alert_File
          echo "$prodmck_errors Errors currently in prodmck error db      (alert at 4)\n" >> /hci/scripts/Database_Error_Alert_File
          echo "$prodepr_errors Errors currently in prodepr error db       (alert at 4)\n" >> /hci/scripts/Database_Error_Alert_File
          echo "$prodtrans_errors Errors currently in prodtrans error db    (alert at 4)\n\n" >> /hci/scripts/Database_Error_Alert_File
          echo "Check for validity of the errors in the appropriate process log or error log.\nRemove the errors from the Db to avoid further alerts.\n" >> /hci/scripts/Database_Error_Alert_File

          # Mail the report, then remove the temporary file
          mail -s "Cloverleaf Error Database Alert" ieteam@meritcare.com < /hci/scripts/Database_Error_Alert_File
          rm /hci/scripts/Database_Error_Alert_File
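
          For a larger site, the per-site checks in the script above could be driven from a single list of thresholds rather than one variable per site. The following is only a rough sketch of that idea, not part of the original script: THRESHOLDS, count_ids and report are made-up names, and the escaped pattern assumes message IDs look like 0.0.<digits> in the hcidbdump output.

```shell
#!/bin/sh
# Sketch only: THRESHOLDS, count_ids and report are illustrative names,
# not Cloverleaf commands.

# "site:limit" pairs, using the sites and limits from the script above.
THRESHOLDS="prodsms:10 prodmck:4 prodtrans:4 prodepr:4"

# count_ids: count the lines on stdin that contain a message ID.
# Assumes IDs have the form 0.0.<digits>; the dots are escaped so they
# match literally.
count_ids() {
    grep -c '0\.0\.[0-9]'
}

# report: read "site count" lines on stdin and print one line for each
# site that is at or over its limit. Returns 0 if any site tripped,
# 1 otherwise.
report() {
    tripped=1
    while read -r site count; do
        for pair in $THRESHOLDS; do
            limit=${pair##*:}
            if [ "${pair%%:*}" = "$site" ] && [ "$count" -ge "$limit" ]; then
                echo "$site: $count errors (alert at $limit)"
                tripped=0
            fi
        done
    done
    return $tripped
}

# Possible use inside the Cloverleaf environment (not executed here):
#   for pair in $THRESHOLDS; do
#       site=${pair%%:*}
#       setsite "$site"
#       echo "$site `hcidbdump -e | count_ids`"
#   done | report > /tmp/errdb.$$ && mail -s "Cloverleaf Error Database Alert" "$MAILTO" < /tmp/errdb.$$
```

          Adding a site then means adding one entry to THRESHOLDS. MAILTO in the commented usage is a placeholder for the team address.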

        • #58013
          David Gordon
          Participant

            I should have mentioned that we are on AIX.  Anyhow, I figured we would end up doing something similar, but I was curious about something I might have been overlooking in the Alert tool.  My lack of shell scripting experience also weighed on that decision!

            Thanks for posting your script, Mike. You saved me some typing this afternoon!
