Messages queued; connection shows "up"


  • #49524
    Ariba Jones
    Participant

      Several of our interfaces that send to the same receiving system have had problems processing messages.  Messages queue up on the Cloverleaf side while the thread still shows “up”.  On the receiving system’s side everything looks fine: no messages are queued and the status shows “up”.

      I turn off the outbound (ob) thread on Cloverleaf, stop and start the corresponding thread on the receiving system’s side, and then bring the Cloverleaf side back up.  After that, messages flow again.

      This issue occurred this morning, so I checked the recovery database while the messages were still queued on the Cloverleaf side.  All of the messages were in state 11.  I expected to see one message in state 14.  Does the fact that every message for the ob thread was in state 11 mean anything?

      Has anyone seen something like this happen?  I am trying to find out what is causing the problem.

      I have been asked to set up an alert that pages the on-call person when the interface is not processing messages.  Has anyone set up an alert like this?  I can’t figure out how to make the alert trigger without false alarms.  Sometimes the thread is “up” and backed up with messages but is still processing.  This mostly affects the ADT interface, so I definitely don’t want to cause an unnecessary panic.  I also need a solution as soon as I can get one.

      Thanks,

      Ariba Jones

    Viewing 4 reply threads
      • #62322
        James Cobane
        Participant

          Ariba,

          If you are using the standard recovery_33 procs, make sure that you DON’T have ‘Await Replies’ set to -1.  With that setting, if the receiving system doesn’t ACK for some reason, the engine will wait forever for a reply without attempting to resend.  The symptoms you describe seem to fall along these lines: by bouncing the threads you cause the engine to send another message, and the receiving system is then likely replying.

          Hope this helps.

          Jim Cobane

          Henry Ford Health

        • #62323
          Ariba Jones
          Participant

            James,

            I checked my Await Replies and it is set to 80 for the outbound threads that this has occurred with.  Do you think I should increase this number?

            Thanks,

            Ariba

          • #62324
            James Cobane
            Participant

              Ariba,

              I think your timeout value is fine.  The next thing to check is whether you are actually getting ACKs back.  Look to see if the ‘Last Rd’ time on the thread correlates to the ‘Last Wt’ time on the thread.  If so, check whether you are getting AA (ACKs) or AR/AE types (NAKs) by turning up the EO on the outbound thread and looking through the log to see the replies.
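The AA versus AR/AE check described above can also be done on a reply copied out of the log. This is only a minimal sketch: it assumes the reply is a standard HL7 v2 acknowledgment with the code in MSA-1, and the names `ack_code` and `classify_reply` are hypothetical helpers, not part of Cloverleaf.

```python
from typing import Optional

# Standard HL7 v2 acknowledgment codes found in MSA-1.
POSITIVE_CODES = {"AA", "CA"}              # application / commit accept
NEGATIVE_CODES = {"AE", "AR", "CE", "CR"}  # error / reject variants


def ack_code(reply: str) -> Optional[str]:
    """Return MSA-1 from an HL7 v2 reply, or None if it cannot be found."""
    # Drop MLLP framing bytes if present, then split into segments.
    # HL7 segments end with a carriage return; newlines are tolerated too.
    text = reply.strip("\x0b\x1c\r\n").replace("\n", "\r")
    segments = [s for s in text.split("\r") if s]
    if not segments or not segments[0].startswith("MSH") or len(segments[0]) < 4:
        return None
    field_sep = segments[0][3]  # the character right after "MSH"
    for segment in segments:
        fields = segment.split(field_sep)
        if fields[0] == "MSA" and len(fields) > 1:
            return fields[1].strip()
    return None


def classify_reply(reply: str) -> str:
    """Classify a reply as 'ack', 'nak', or 'unknown'."""
    code = ack_code(reply)
    if code in POSITIVE_CODES:
        return "ack"
    if code in NEGATIVE_CODES:
        return "nak"
    return "unknown"
```

A reply classified as "nak" or "unknown" would be the one to look at more closely in the log.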

              Jim Cobane

              Henry Ford Health

            • #62325
              Ariba Jones
              Participant

                Jim,

                I checked the ‘Last Rd’ and ‘Last Wt’ times on one of the threads and they do correspond.  I am going to look in the log file and let you know what I find.

                Thanks,

                Ariba

              • #62326
                Robert Milfajt
                Participant

                  Back in the day, I had to write a ksh script to run under cron.  It would run hcimsiutil on a thread and check the date/time last sent and the queue depth.  If the time since the last send was greater than some threshold of seconds, and the depth was greater than some number, it would send an e-mail and bounce the thread (which usually reset the connection).

                  That was at a former employer, and I do not have it handy to send.  I can tell you it was not a very big script.
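The check described above can be sketched roughly as follows. This is not the original script: it covers only the decision step, and it assumes the queue depth and last-send time have already been read from the thread statistics. The names `ThreadStats` and `is_stalled` and the default thresholds are hypothetical; gathering the values with hcimsiutil, sending the e-mail, and bouncing the thread are left out because the thread does not give those details.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ThreadStats:
    """Values assumed to have been collected for one outbound thread."""
    name: str
    queue_depth: int       # messages currently queued for the thread
    last_write: datetime   # date/time the thread last sent a message


def is_stalled(
    stats: ThreadStats,
    now: datetime,
    min_depth: int = 10,                         # hypothetical threshold
    max_idle: timedelta = timedelta(minutes=5),  # hypothetical threshold
) -> bool:
    """True when messages are queued AND nothing has been sent recently."""
    idle = now - stats.last_write
    return stats.queue_depth > min_depth and idle > max_idle
```

Requiring both conditions is what keeps a thread that is backed up but still sending from raising an alert: its queue may be deep, but its last-send time stays recent.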

                  This does not tell you what is causing the problem, but it certainly gets you by until you figure it out.

                  Hope this helps,

                  Robert Milfajt
                  Northwestern Medicine
                  Chicago, IL
