Wait Reply ACK Latency


  • #120163
    Timothy O’Donnell
    Participant

      Good morning. Using Cloverleaf 20.1 here. We’re having an issue at my institution with an outbound thread to a vendor wherein often – but not always – the outbound queue backs up because “timed out while awaiting replies on thread.” The vendor confirmed they are sending the ACK immediately upon receiving our message and our network team confirms we are receiving the ACKs through the firewall at the time expected. One pattern I do notice from the logs is that the ACK will come in directly after the engine resends the OB message so it’s out of the waiting state milliseconds prior to the ACK coming in. Then that cycle repeats over and over. Sometimes it takes a few minutes, sometimes over an hour of resends and then finally it just works with no further intervention.

      The outbound thread is set up as pdl-tcpip, Outbound Only, Await Replies, Timeout 45. We’ve increased the Reply Timeout to 60 and 120 with no difference. We’ve updated the mlp_tcp.pdl to 60 seconds with no difference. The delayed ACK doesn’t happen for every outbound message but does happen often throughout the day, probably 1/3 of total messages sent outbound from this thread.

      Is there anything else we can do to try and resolve this issue within Cloverleaf? Or at least diagnose the issue? We’re trying to get a Wireshark trace of the ACK once it’s through the firewall to see what’s going on, but we don’t have issues of this kind with ANY of our other hundred-plus outbound threads. The vendor is claiming it’s not them. Our network team is saying it’s not a VPN issue. We’re unsure of how to proceed because we don’t think it’s a Cloverleaf issue either. Any help would be appreciated. Thanks!

      • #120164
        Keith McLeod
        Participant

          Are you seeing “No Match, No More Phrases to Try” in your log?  If so, that comes from the PDL and is caused by an improper wrapper.  The expected wrapper is <\x0B><Your Ack Msg><\x1C><\x0D>.

          Have you upped the engine noise to see what you are receiving?

          I use an eoalias of enable_pdl with

          ENABLE pdl * * *

          This shows the message in hex so you can see the wrapper in your log….  If the delayed ones don’t follow the pattern, they could be timing out before being processed.

          Should see the wrapper in Wireshark as well.

          Hope this helps…
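
For checking a captured ACK against that wrapper, a minimal Python sketch could take the hex bytes shown in the engine log or a Wireshark capture and report which framing bytes are missing. This is not Cloverleaf code, and the function names are hypothetical, chosen only for illustration:

```python
# MLLP framing bytes as described above: <0x0B> message <0x1C><0x0D>
START_BLOCK = b"\x0b"
END_BLOCK = b"\x1c"
CARRIAGE_RETURN = b"\x0d"


def from_hex_dump(hex_text: str) -> bytes:
    """Convert space-separated hex (e.g. '0b 4d 53 48 1c 0d') to bytes."""
    return bytes.fromhex(hex_text)  # fromhex skips whitespace between bytes


def mllp_frame_problems(raw: bytes) -> list[str]:
    """Return a list of framing problems; an empty list means the wrapper looks right."""
    problems = []
    if not raw.startswith(START_BLOCK):
        problems.append("missing leading <0x0B>")
    if not raw.endswith(END_BLOCK + CARRIAGE_RETURN):
        problems.append("missing trailing <0x1C><0x0D>")
    return problems
```

Running `mllp_frame_problems(from_hex_dump(...))` on the bytes of a delayed ACK and of a timely one would show whether the two differ in their wrapper at all.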

          • #120166
            Timothy O’Donnell
            Participant

              I’ve upped the engine noise and the incoming ACKs have the proper wrapper as you mentioned.

          • #120169
            Charlie Bursell
            Participant

              Are you saying you can see the incoming ACK in the logs but the engine is not receiving it?  Weird!

              If that is not the case, then turn up the timeout to something like 300 and see if the ACK ever shows up in the logs.  If not, it is a vendor problem.

            • #120170
              Robert Kersemakers
              Participant

                I have a feeling the connection between CL and the other system is dropped after the message is sent, so the ACK can’t be sent by the other system. As soon as CL opens the connection again to re-send the message, the ACK from the previous message is received, as you noticed.
                Maybe the other system is closing the connection before the ACK is sent, in certain cases. The ACK needs to be sent over the same connection the message was sent on, so this could be a vendor issue.
                Maybe something else (firewall/VPN) is closing the connection. But this would be weird; normally that only happens after a certain period of inactivity.
                Have you tried turning up the engine noise to see the complete log, including tcpip? It should tell you whether the connection is closed. Huge logs, but could be very helpful.
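
One way to test this theory outside the engine is a small standalone client that sends one framed message and waits for the ACK on the same socket, then reports whether the ACK arrived, the wait timed out, or the peer closed the connection first. The following is only a sketch in Python under assumptions: `send_and_await_ack` is a hypothetical helper and not part of Cloverleaf, the host, port and test message would be your own, and sending test traffic to the vendor would need their agreement.

```python
import socket

MLLP_START = b"\x0b"
MLLP_END = b"\x1c\x0d"


def send_and_await_ack(host: str, port: int, message: bytes,
                       timeout: float = 45.0) -> tuple[str, bytes]:
    """Send one MLLP-framed message and wait for the reply on the SAME connection.

    Returns (status, bytes_received) where status is one of:
      "ack"         - a complete MLLP frame came back
      "timeout"     - nothing complete arrived within `timeout` seconds
      "peer_closed" - the other side closed or reset the connection first
    """
    framed = MLLP_START + message + MLLP_END
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(framed)
        received = b""
        try:
            while not received.endswith(MLLP_END):
                chunk = sock.recv(4096)
                if not chunk:  # empty read means the peer closed the socket
                    return ("peer_closed", received)
                received += chunk
        except socket.timeout:
            return ("timeout", received)
        except ConnectionResetError:
            return ("peer_closed", received)
        return ("ack", received)
```

A run of "peer_closed" results with no ACK bytes would be consistent with the connection being dropped before the reply is written, while "timeout" on an open connection would point at the reply simply not being sent in time.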

                Something else you could try in CL is using the native tcpip driver instead of the pdl-tcpip. Choose protocol:tcpip and in Properties fill in host and port. Choose Type ‘Encapsulation’ and with Configure… set it to MLLP. It does the same as the mlp_tcp.pdl.

                Zuyderland Medisch Centrum; Heerlen/Sittard; The Netherlands

                • #120173
                  Timothy O’Donnell
                  Participant

                    This 100% seems like what is happening. What I don’t understand is why it happens at all but then resolves itself? Sometimes it takes multiple resends and sometimes it’s just one or two. We’ve seen it stuck up to an hour doing resends.

                    We’ve tried switching the thread to tcpip with no change in behavior.

                    I’ve got a handful of logs from various times this has happened. There’s not much in the way of issues in between messages – the log just shows the queueing of the OB message, waiting for replies, OB-Data queue has NO WORK or # msgs (depending), and then processing the SOCKET event for the ACK coming in (albeit after the wait reply time has elapsed).
