5.7 alert problem


  • Creator
    Topic
  • #52100
    David Barr
    Participant

      We had an alert set up on 5.5 that isn’t working on 5.7.  The alert is for specific threads when the error count changes by 1 or more.  The alert execs a script that e-mails the error contents to an analyst, then it deletes the messages from the error database.  The problem is that the alert only fires one time.  If new errors come in, the alert doesn’t get fired.  Any ideas?

      {ALERT
         { VALUE errs }
         { SOURCE {chartmaxx_adt_out } }
         { MODE delta }
         { WITH -2 }
         { COMP {>= 1} }
         { FOR once }
         { WINDOW {* * * * * *} }
         { ACTION {
             { exec {/quovadx/qdx5.7/integrator/scripts/errordb_alert.sh live2 "%A" somebody@valleymed.org "%F"} }
         } }
      }

    Viewing 12 reply threads
    • Author
      Replies
      • #73066
        Robert Kersemakers
        Participant

          Hi David,

          I have no experience with alerts for error count changes, so can’t help you there.

          But my first reaction is to try to ‘touch’ your alert file (normally ‘default.alrt’). This way the monitor daemon will see a ‘new’ alert file and will start a new cycle of monitoring. (Not very well explained, but I hope you get my drift.)

          Zuyderland Medisch Centrum; Heerlen/Sittard; The Netherlands

        • #73067
          Michael Hertel
          Participant

            I’d like to suggest that the alert fired at 1 and stayed true.

            The alert condition needs to be cleared before firing again.

            You probably have to zero out the statistics to get the error count back to zero.

            Also, you have it set to fire once?

          • #73068
            David Barr
            Participant

              Touching the alert file didn’t do anything.  Maybe I need to touch it from within my alert script.

              I had the alert duration set to “once”.  It seems like zeroing the statistics might cause other issues.  I’m going to try to set the alert duration to 1 minute and see if that fixes anything.

            • #73069
              Keith McLeod
              Participant

                You may want to set repeat in 5.7 and a maximum.  I will check my setting tomorrow since I use this alert as well.  I know it is working in 5.7.

              • #73070
                Chris Williams
                Participant

                  Once an alert fires, you must touch the file to enable Cloverleaf to reset the alert, so it will recognize that alert condition when it occurs again. Clearing statistics doesn’t have anything to do with this.

                  We use a cron job to touch the file hourly, so any alert conditions not cleared by then will re-fire.
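                  A minimal sketch of such an hourly touch, assuming the alert file path is passed in as an argument. The function name `touch_alert_file` and the paths in the crontab comment are hypothetical and would need to match your own site layout.

```shell
#!/bin/sh
# touch_alert_file: update the modification time of an alert file so the
# monitor daemon sees it as changed. The path is supplied by the caller;
# nothing in this function is Cloverleaf-specific.
touch_alert_file() {
    alert_file="$1"
    if [ -z "$alert_file" ]; then
        echo "usage: touch_alert_file /path/to/default.alrt" >&2
        return 2
    fi
    if [ ! -f "$alert_file" ]; then
        echo "alert file not found: $alert_file" >&2
        return 1
    fi
    touch "$alert_file"
}

# Hypothetical crontab entry that runs a wrapper script at the top of each hour:
# 0 * * * * /path/to/touch_alert.sh /path/to/site/default.alrt
```

                  The existence check keeps the job from silently creating an empty alert file if the path is wrong.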

                • #73071
                  Michael Hertel
                  Participant

                    Not to hijack the discussion but if you don’t clear the statistics and the thread has an error count, and your alerts run say from 8am to 11pm, then every morning when the alert wakes back up at 8am, the engine will fire the alert again.

                  • #73072
                    Ted Mui
                    Participant

                      We had the same issue when we converted our 5.5 Alerts to 5.7

                      We changed our alert time windows from “by event time” to “by range”

                      So if you vi the default.alrt file you will see the parameter change from { WINDOW * * * * } to { WINDOW */*/*/* }. Hope this works for you.

                    • #73073
                      David Barr
                      Participant

                        I’ve changed the alert type from “error count” to “error database”, removed the “delta” and changed the time window to a range.  It seems to be working ok now.

                        While testing this, I discovered something else interesting.  I managed to crash the monitor daemon a few times when I had alerts set up with a “notify” action.  I wasn’t aware of this, but it looks like this will always try to run an “hciguimsg” command, even if you don’t have an X server set up.  And if the DISPLAY variable that was set when you start the monitor daemon isn’t pointed at something usable, the monitor daemon sometimes crashes or sometimes freezes (at least for me).

                      • #73074
                        David Barr
                        Participant

                          Ok, maybe my hcimonitord crashes have nothing to do with hciguimsg.  I just had a crash when the DISPLAY was set and hciguimsg ran successfully.

                        • #73075
                          Jared Parish
                          Participant

                            I seem to have the same issue as you at two of my clients.  I haven’t yet figured out the cause.  The monitorD correctly parses the alerts file, and doesn’t crash until the alert fires.  And what’s frustrating is that if I move this alert file to their test box and try to reproduce, it works fine.

                            Both servers are:

                            CL 5.7 (one’s Rev1, the other Rev2)

                            Redhat 5.3

                            - Jared Parish

                          • #73076
                            John Mercogliano
                            Participant

                              The crashes might be the monitord log files getting too large.  We had this same problem when we moved to 5.7.

                              I had to add

                              Code:

                              hcicmd -p hcimonitord -t d -c "cycle"


                              for each of our sites when we cycle our log and smat files, to ensure the monitord logs are cycled also.  Under 5.2, at least, only the actual command received was written to the log, but now the data that is returned is also written to the log, so with alerts firing it can get large very quickly.
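                              One way to fold that command into an existing log-cycling script is sketched below. The `hcicmd` invocation is the one quoted above; the function name and the `HCICMD` override variable are hypothetical additions for dry runs, and selecting each site before the call is left to the surrounding script.

```shell
#!/bin/sh
# cycle_monitord_log: ask the monitor daemon to cycle its log, using the
# hcicmd invocation quoted in the post above.
# HCICMD is a hypothetical override: set HCICMD=echo to print the
# arguments instead of running the real command.
cycle_monitord_log() {
    "${HCICMD:-hcicmd}" -p hcimonitord -t d -c "cycle"
}

# Example dry run:
#   HCICMD=echo cycle_monitord_log
```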

                              John Mercogliano
                              Sentara Healthcare
                              Hampton Roads, VA

                            • #73077
                              Sallie Turner
                              Participant

                                We recently moved from version 5.3 to 5.7 sp2 and had to do a lot of work on our alerts to get them to work. The biggest problem was getting an alert to run once a day at a specified time. We have the same alert programmed on 7 different sites; on some of them it would run and reset, and on some it wouldn’t. Reloading the alerts with the Site Daemons GUI would sometimes result in them running that night, but sometimes it wouldn’t. Only actually stopping and starting the monitor daemon was sure to make them run at least once, and even then some would reset and some would not.

                                One thing that has seemed to be successful is to use the repeating field. If you tell it to repeat every 1440 minutes it runs once a day at the same time and it doesn’t require that it gets reset.

                              • #73078
                                Bob Richardson
                                Participant

                                  One and All,

                                  We have discovered in moving from 5.6R2 to 5.7R2 that any “ampersand” (&) in exec type alerts that you have directed to run in the background needs to be removed. With the ampersand in place, the alerts error, you will see “failed” to trigger in the hcimonitord logs, and they will not execute.
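                                  A rough way to find candidates before upgrading is to scan the alert file for exec actions that contain an ampersand. This is only a sketch: the function name is hypothetical, and because it is a plain text search, an "&" that is part of an argument is reported too and needs a manual look.

```shell
#!/bin/sh
# find_background_execs: print, with line numbers, every line of an alert
# file that mentions an exec action and also contains an ampersand.
# Returns 0 if at least one such line is found, 1 if none, 2 if the
# file does not exist.
find_background_execs() {
    alert_file="$1"
    if [ ! -f "$alert_file" ]; then
        echo "alert file not found: $alert_file" >&2
        return 2
    fi
    grep -n 'exec' "$alert_file" | grep '&'
}

# Example:
#   find_background_execs /path/to/site/default.alrt
```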

                                  Following the discussion about the repeat option closely.

                                  Thanks to all who post on alert issues for 5.7R2.

                                  Happy Holidays!
