Connection bounce is resetting alerts


  • Creator
    Topic
  • #50458
    John Zalesak
    Participant

    I am new to alerts, but there are some already set up in our system.  The employees who set them up are no longer here, so I am on my own.

    Here is what I want to do.

    I want to have an alert on last received.  I want to bounce the thread if it did not get something in the last 30 minutes.  If it did not get something in 95 minutes, I want an email.

    The idea is to bounce a couple of times and, if there are still no messages, alert someone.

    My problem is that I think the bounce triggered by the short-interval alert is resetting the start time of the longer-interval alert, so the longer one never goes off.

    For example: here is what I set up in a test site:

    First Bounce Alert

    Type: last receive

    Source: Inbound_side

    Source Count: any

    Comparing: >= 60

    Duration: once

    Second Email Alert

    Type: last receive

    Source: Inbound_side

    Source Count: any

    Comparing: >= 120

    Duration: once

    When I run it, the 60-second alert goes off continually but the 120-second alert never does.

    Any ideas on how to bring my idea into reality?

    Thanks in advance for your responses.

  • Author
    Replies
    • #66158
      Robert Kersemakers
      Participant

      Hi John,

      It looks that way, yes. The first alert always fires, so maybe the second one doesn’t because the ‘trigger’ for this thread is reset.

      Have you tried putting the second alert before (i.e. above) the first alert in the Alert Configuration file? That should cause the second alert to fire (if >120) and, if not, the first alert should still fire (if >60).

      Just a wild guess here…

      Zuyderland Medisch Centrum; Heerlen/Sittard; The Netherlands

    • #66159
      John Zalesak
      Participant

      Robert

      Thanks for your idea.  I will give it a try and let you know.

    • #66160
      John Zalesak
      Participant

      Robert,

      I tried your idea first.

      Alerts in this order

      NUMBER 1

      Type: last receive

      Source: Inbound_side

      Source Count: any

      Comparing: >=75

      Duration: Once

      NUMBER 2

      Type: last receive

      Source: Inbound_side

      Source Count: any

      Comparing: >=60

      Duration: Once

      -> All I ever got was Number 2 (60 secs)

      Then I tried

      NUMBER 1

      Type: last receive

      Source: Inbound_side

      Source Count: any

      Comparing: >=1

      Duration: nsec 75

      NUMBER 2

      Type: last receive

      Source: Inbound_side

      Source Count: any

      Comparing: >=1

      Duration: nsec 60

      The only one I got to fire was NUMBER 2 (61 sec)

      I am really grabbing at straws here.

      Can anyone point me in the correct direction??

      Thanks!

    • #66161
      John Zalesak
      Participant

      Here is a little more info.  I checked my hcimonitord file.  Looks OK to me.  The longer alert (Alert #12) never fires.

      Any comments or assistance is greatly appreciated.

      [aler:aler:INFO/0:  hcimonitord:11/13/2008 11:15:02] New alert #12:

      {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 75}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      [aler:aler:INFO/0:  hcimonitord:11/13/2008 11:15:02] can't read "HCISITEDIR": no such variable

      [aler:aler:INFO/0:  hcimonitord:11/13/2008 11:15:02] New alert #13:

      {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 60}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:16:07] Alert #13 triggered.

      alert: {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 60}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      action: IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - Thread last inbound message received time of Inbound_side has been more than or equal to 1for 60 seconds. Connection bounced." &

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:16:07] Completed Cascade Actions

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:17:22] Alert #13 triggered.

      alert: {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 60}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      action: IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - Thread last inbound message received time of Inbound_side has been more than or equal to 1for 60 seconds. Connection bounced." &

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:17:22] Completed Cascade Actions

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:18:37] Alert #13 triggered.

      alert: {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 60}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      action: IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - Thread last inbound message received time of Inbound_side has been more than or equal to 1for 60 seconds. Connection bounced." &

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:18:37] Completed Cascade Actions

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:19:53] Alert #13 triggered.

      alert: {VALUE lastr} {SOURCE {Inbound_side }} {MODE actual} {WITH -2} {COMP {>= 1}} {FOR {nsec 60}} {WINDOW {* * * * * *}} {HOST {}} {ACTION {{exec {IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - %A. Connection bounced."}}}}

      action: IMalerts_jtz.sh sitejtz tst_pro Inbound_side IM_Alert_Last_Recd "$HCISITEDIR - Thread last inbound message received time of Inbound_side has been more than or equal to 1for 60 seconds. Connection bounced." &

      [aler:aler:WARN/0:  hcimonitord:11/13/2008 11:19:53] Completed Cascade Actions

    • #66162
      Michael Hertel
      Participant

      Yes you are grabbing.

      By bouncing you are resetting the stats on the thread, so your email will never fire.

    • #66163
      John Zalesak
      Participant

      So what is the trick?

      I would like to have it try to fix itself a few times before I get woken up at 3 am.

      Is there another way to come at this problem?

    • #66164
      Michael Hertel
      Participant

      You could write an alert script that reads from and writes to a file, from which you could extract previous stats.

      If it were me, I’d email with the first alert and when I see multiple back to back emails I would know there is an issue.

      You might even be able to have the alert script fire off and look again in x amount of time to see if it fixed itself.

      I’ll post one of our scripts in a minute.
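      The "fire off and look again" idea can be sketched as a small shell wrapper. This is only an outline under assumptions: the function name `bounce_and_recheck` is made up, and the three commands passed to it (how to bounce the thread, how to test whether it has recovered, how to notify someone) are placeholders for whatever your site uses. None of them are Cloverleaf commands.

```shell
#!/bin/sh
# Sketch only: bounce, wait, then escalate if the thread is still unhealthy.
# Usage: bounce_and_recheck WAIT_SECS BOUNCE_CMD CHECK_CMD NOTIFY_CMD
# Each *_CMD is a placeholder command string (hypothetical, site-specific).
bounce_and_recheck() {
    wait_secs=$1
    bounce_cmd=$2
    check_cmd=$3
    notify_cmd=$4

    sh -c "$bounce_cmd"            # first try to self-heal
    sleep "$wait_secs"             # give the thread time to reconnect
    if sh -c "$check_cmd"; then    # exit status 0 means "recovered"
        return 0
    fi
    sh -c "$notify_cmd"            # still down: wake someone up
    return 1
}
```

      A call with a wait of 300 would give the thread five minutes after the bounce before the function decides whether to notify anyone.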

    • #66165
      Michael Hertel
      Participant

      Here is part of our script:

      Look at the outbound qdepth portion to see the sleep statement.



      #!/hci/root/bin/hcitcl
      set ts [clock format [clock seconds]]


    • #66166
      John Zalesak
      Participant

      Michael,

      Thanks for the script.  I go to Tcl class next week so maybe it will make a little more sense when I get back…. If I pay attention!

      Currently we do get the email on the first alert, but after being woken up at 3 am for a couple of days only to say "it's the first one, go back to bed", you start thinking about doing something different.

      Our 1st idea (simple) is to

      Set an alert to bounce if nothing has been received in 45 minutes.

      Set an alert to send an e-mail to my BlackBerry (and get me out of bed) if the thread has been in the opening state for 30 minutes.

      Hopefully the bounce fixes the problem.  If, after the bounce, a connection cannot be made within 30 minutes, it's time to get woken up.

      Our 2nd idea (more complex) is to

      Set an alert to bounce if nothing has been received in, say, 15 minutes.

      The script that bounces should write a time stamp to a log file.

      The script will also read the log file, do an analysis (say, look for 4 bounces in a row) and, if need be, send an email to my BlackBerry to wake me up.

      Any comments / suggestions would be greatly appreciated.
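      The second idea (log each bounce, then analyse the log) might look roughly like the sketch below. The function names and the log file are hypothetical, and the caller is assumed to pass the current time as epoch seconds, for example from `date +%s` where that is available. Counting only the bounces inside a time window also separates four bounces in a row from four bounces spread over a week.

```shell
#!/bin/sh
# Sketch only: log each bounce and decide whether to escalate.
# All function names and file paths here are hypothetical.

# Append one epoch timestamp per bounce.
record_bounce() {     # record_bounce LOGFILE NOW_EPOCH
    echo "$2" >> "$1"
}

# Print how many bounces were logged within the last WINDOW_SECS seconds.
recent_bounces() {    # recent_bounces LOGFILE NOW_EPOCH WINDOW_SECS
    [ -f "$1" ] || { echo 0; return; }
    awk -v now="$2" -v win="$3" \
        '$1 ~ /^[0-9]+$/ && now - $1 <= win { n++ } END { print n + 0 }' "$1"
}

# Succeed (exit 0) when at least LIMIT bounces fall inside the window.
should_escalate() {   # should_escalate LOGFILE NOW_EPOCH WINDOW_SECS LIMIT
    [ "$(recent_bounces "$1" "$2" "$3")" -ge "$4" ]
}
```

      With a 15-minute alert, four bounces in a row span three 15-minute gaps, so a window of about an hour (3600 seconds) and a limit of 4 would be a reasonable starting point. The bounce script would call `record_bounce` and then send the e-mail only if `should_escalate` succeeds.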

    • #66167
      Michael Hertel
      Participant

      Good luck with the class 😀

    • #66168
      John Zalesak
      Participant

      thanks

    • #66169
      James Cobane
      Participant

      John,

      You could utilize the counter functions provided by Cloverleaf (e.g. CtrNextValue, CtrResetValue) within a Tcl script to keep a count of how many times the first alert fires.  You could run this script on your 30-minute alert; if the counter value is >= your desired number (e.g. 3), you could reset the counter and trigger the e-mail from within the script via an ‘exec’ command.
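      The count-then-reset pattern described above can be sketched with an ordinary file. This is a stand-in, not the Cloverleaf counter commands: their exact arguments are not shown in this thread, so the sketch avoids them, and the name `bump_counter` and the counter file are hypothetical.

```shell
#!/bin/sh
# Sketch only: a plain-file counter standing in for the Cloverleaf
# counter commands, whose exact usage is not shown in this thread.

# Increment the counter stored in FILE. When it reaches THRESHOLD,
# reset it to 0 and print "escalate"; otherwise print "bounce".
bump_counter() {      # bump_counter FILE THRESHOLD
    count=0
    if [ -f "$1" ]; then
        read -r count < "$1" || true
    fi
    case $count in ''|*[!0-9]*) count=0 ;; esac   # treat junk as zero
    count=$((count + 1))
    if [ "$count" -ge "$2" ]; then
        echo 0 > "$1"
        echo escalate
    else
        echo "$count" > "$1"
        echo bounce
    fi
}
```

      The alert's script would bounce the thread every time and send the e-mail only when the function prints `escalate`. A bare counter like this cannot tell three bounces in 90 minutes from three spread over a week.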

      Also, I don’t believe the order of the alerts has any effect; it’s all based on the defined conditions.  Order would likely only play a role if two alerts had the same condition defined.

      Hope this helps.

      Jim Cobane

      Henry Ford Health

    • #66170
      John Zalesak
      Participant

      James – Thanks for your comments.

      I agree; in my testing, the order of the alerts has no effect.

      Thanks for pointing out the counter functions in Cloverleaf.  I had no idea they were there.  Where do I find out about them?  Are there counters that will tell us there were no messages over the last 3 bounces (alerts)?

      We thought of a similar idea with our own counter stored in a file.  Our problem was how to differentiate between 3 bounces in the last 90 minutes and 3 bounces in the last week.

      Thanks again.

    • #66171
      James Cobane
      Participant

      John,

      The counter functions are documented in the ‘Reference Manual’ under Tcl extensions, Counter Commands.  The counter commands let you create your own counter files for whatever use you desire (e.g. creating or maintaining sequence numbers).  To tell whether the count covers the last 90 minutes or the past week, you’ll probably have to store the date/time info in a file and use it for comparison.
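      One way to store the date/time alongside the count is sketched below: a single state line holding the count and the time of the last bounce, with the count starting over when the previous bounce is too old. The function name, the state file, and the epoch-seconds argument (for example from `date +%s`) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: a counter that remembers when it was last bumped, so that
# bounces far apart in time do not add up. Names are hypothetical.

# STATE_FILE holds one line: "<count> <epoch of last bounce>".
# Prints the new count; the count restarts at 1 when the previous
# bounce is more than MAX_GAP seconds old.
bump_recent_counter() {   # bump_recent_counter STATE_FILE NOW_EPOCH MAX_GAP
    count=0
    last=0
    if [ -f "$1" ]; then
        read -r count last < "$1" || true
    fi
    case $count in ''|*[!0-9]*) count=0 ;; esac   # treat junk as zero
    case $last in ''|*[!0-9]*) last=0 ;; esac
    if [ $(( $2 - last )) -gt "$3" ]; then
        count=0                                   # previous bounce is stale
    fi
    count=$((count + 1))
    echo "$count $2" > "$1"
    echo "$count"
}
```

      With a 30-minute alert, a `MAX_GAP` of around 2400 seconds would treat bounces up to 40 minutes apart as consecutive, so a printed count of 3 means three bounces in a row rather than three over the past week.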

      Jim Cobane

      Henry Ford Health

      P.S.  Have fun in your tcl class!
