Good morning. Using Cloverleaf 20.1 here. We’re having an issue at my institution with an outbound thread to a vendor: often, but not always, the outbound queue backs up with the error “timed out while awaiting replies on thread.” The vendor confirmed they are sending the ACK immediately upon receiving our message, and our network team confirms we are receiving the ACKs through the firewall at the expected time. One pattern I do notice in the logs is that the ACK comes in directly after the engine resends the OB message, so the thread has left the waiting state only milliseconds before the ACK arrives. Then that cycle repeats over and over. Sometimes it takes a few minutes, sometimes over an hour of resends, and then it finally just works with no further intervention.
The outbound thread is set up as pdl-tcpip, Outbound Only, Await Replies, Timeout 45. We’ve increased the Reply Timeout to 60 and then 120 with no difference. We’ve updated the timeout in mlp_tcp.pdl to 60 seconds with no difference. The delayed ACK doesn’t happen for every outbound message but does happen often throughout the day, probably for about a third of the messages sent outbound from this thread.
Is there anything else we can do to try to resolve this issue within Cloverleaf? Or at least diagnose it? We’re trying to get a Wireshark trace of the ACK once it’s through the firewall to see what’s going on, but we don’t have issues of this kind with ANY of our other hundred-plus outbound threads. The vendor is claiming it’s not them. Our network team is saying it’s not a VPN issue. We’re unsure how to proceed because we don’t think it’s a Cloverleaf issue either. Any help would be appreciated. Thanks!
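For the capture itself, one thing we could check is whether the vendor’s ACK is framed the way an MLLP PDL would expect. Below is a minimal Python sketch, assuming standard MLLP framing (0x0b start block, 0x1c 0x0d end block). The function name `check_mllp_frame` and the sample ACK bytes are hypothetical and made up for illustration; they are not from Cloverleaf, the vendor, or our logs.

```python
# Hypothetical helper (not part of Cloverleaf): inspects the raw bytes of one
# captured ACK and reports framing problems, assuming standard MLLP framing.
MLLP_START = b"\x0b"      # <VT> start-of-block
MLLP_END = b"\x1c\x0d"    # <FS><CR> end-of-block


def check_mllp_frame(raw):
    """Return a list of framing problems for one captured ACK (empty if none)."""
    problems = []
    if not raw.startswith(MLLP_START):
        problems.append("missing leading 0x0b start block")
    if not raw.endswith(MLLP_END):
        problems.append("missing trailing 0x1c 0x0d end block")
    # Strip framing bytes (and trailing CRs) from both ends before checking the payload.
    body = raw.strip(b"\x0b\x1c\x0d")
    if not body.startswith(b"MSH"):
        problems.append("payload does not start with MSH")
    return problems


# Made-up sample ACK for illustration only.
good_ack = b"\x0bMSH|^~\\&|VENDOR|SITE|CL|HOSP|202401010000||ACK|1|P|2.3\rMSA|AA|1\r\x1c\x0d"
truncated_ack = good_ack[:-2]  # same ACK with the end block cut off

print(check_mllp_frame(good_ack))
print(check_mllp_frame(truncated_ack))
```

If the ACK bytes exported from Wireshark come back with a missing end block, that would at least be consistent with the thread sitting in the waiting state until the timeout, though it would not prove that is the cause.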