Message control ID validation

This topic has 7 replies, 5 voices, and was last updated 11 years, 2 months ago by Jeff Dinsmore.

May 27, 2014 at 6:42 pm #54229
Jeff Dinsmore
Participant
I’m using the cl_check_ack proc for handling ACKs.

Recently, I noticed that messages and their associated ACKs sometimes seem to be out of sync.

Further investigation shows that cl_check_ack apparently doesn’t validate the message control ID of the sent message against the one that is returned in the ACK.

This works fine until there’s a timeout and a message is resent – and the receiver eventually returns ACKs for all of the messages.

Something like this:

--> message 1 (times out)

--> message 1 (resend)

ACK(1) <--

--> message 2

ACK(1) (response to resent message 1) <--

--> message 3

ACK(2) <--

--> message 4

ACK(3) <--

When this happens, there seems to be an extra ACK in the hopper that’s used for the wrong message. This can have the effect of sending new messages before the receiver is ready.

Am I missing something? Is this normal/acceptable?

Jeff Dinsmore
Chesapeake Regional Healthcare

- May 28, 2014 at 1:14 am #80651
  Charlie Bursell
  Participant
  You hurt me Jeff. That is my proc 😀
  
  You should realize that none of the furnished procs are meant to be shrink wrap and work for all occasions. The proc you refer to is based on the premise that you send a message and get an ACK and the engine will not send another until you receive the ACK.
  
  I would look at your setup. There should be no way the vendor should receive a different message before the previous one is ACK’ed or discarded (KILLREPLY). If it times out the same message, not a different message, should be sent (PROTO). You may want to adjust the timeout.
  
  Feel free to modify any of the furnished procs to meet your specific needs
- May 28, 2014 at 3:52 pm #80652
  Jeff Dinsmore
  Participant
  If I’d meant to hurt you, I’d have disparaged your parentage – or worse – I’d have accused you of being a lover of Microsoft or something like that ;o)
  
  I realize that the Cloverleaf procs are not one-size-fits-all, but I thought that validation of the Message Control ID could be assumed and never bothered to question it.
  
  In cl_check_ack’s current state, the problem with changing a timeout – aside from waiting forever – is that the receiving system always has the opportunity to go past that number – whatever it is.
  
  Most often, the receiver eventually replies to that first message. So, I’ll either discard the timed-out message and send the next one or resend it some number of times or until I receive an ACK. Either way, I’m no longer expecting an ACK for the timed-out message. When that first ACK arrives, it’s applied to the wrong message.
  
  Perhaps my inexperience with Cloverleaf is showing, but I was surprised by this discovery.
  
  Have any of the rest of you in the CL community solved this problem? If so would you be willing to share your solution?
  
  Jeff Dinsmore
  Chesapeake Regional Healthcare
- May 28, 2014 at 6:59 pm #80653
  Jim Kosloskey
  Participant
  Jeff,
  
  Generally speaking finding the ‘happy’ number for the timeout period handles the issue.
  
  You are right that the receiving system could at any time exceed that ‘happy’ number but if the number is reasonable and arrived at in a proper fashion that should happen rarely.
  
  Another option is to track (via a tcl proc) the number of times a resend for the same message has occurred in case there is an excessive wait and take some action if a desired threshold is reached.
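  
  A minimal sketch of that kind of tracking, in plain Tcl. The proc names, the global array `resend_count`, and the default threshold of 3 are all hypothetical; none of them come from cl_check_ack or any other furnished proc. How the result is acted on (alert, email, stop the thread) is left to the caller.
  
  ```tcl
  # Hypothetical sketch: count resends per message control ID and report
  # when a threshold is reached. Names and threshold are illustrative only.
  array set resend_count {}
  
  # Record one more resend of $mcid. Returns 1 once the message has been
  # resent $threshold times or more, otherwise 0.
  proc note_resend {mcid {threshold 3}} {
      global resend_count
      if {![info exists resend_count($mcid)]} {
          set resend_count($mcid) 0
      }
      incr resend_count($mcid)
      return [expr {$resend_count($mcid) >= $threshold}]
  }
  
  # Forget the counter once the message has been ACKed or discarded.
  proc clear_resend {mcid} {
      global resend_count
      unset -nocomplain resend_count($mcid)
  }
  ```
  
  A caller would presumably invoke `note_resend` each time the same message is resent and `clear_resend` when a good ACK finally arrives.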
  
  We have not had any issue once we got the wait time set properly. At my previous place we did use ‘wait forever’ for an acknowledgment and used Alerts to indicate when we got stuck waiting too long (excessive queue depth primarily). Wait forever definitely makes sure you won’t resend a message and is useful when it is absolutely imperative the receiving system never receives a resend (at least automatically).
  
  You can try to match up Control IDs, but how will you be able to tell whether the next acknowledgment is likely to be the one you are looking for, or whether it has already passed, unless you have some sort of sequencing exchange? That is a whole other can of worms.
  
  email: jim.kosloskey@jim-kosloskey.com 30+ years Cloverleaf, 60 years IT – old fart.
- May 29, 2014 at 2:30 pm #80654
  Jeff Dinsmore
  Participant
  The resending is not necessarily undesirable. In the case of ADT, we want to make sure that a given message is delivered even if we time out waiting for a response.
  
  When matching message control IDs (MCID) you simply need to discard any ACK with an MCID other than the one you’re waiting for.
  
  Let’s take my example:
  
  --> message 1.1 (times out)
  
  --> message 1.2 (resend)
  
  ACK(1.1) <--
  
  --> message 2
  
  ACK(1.2) (response to resent message 1.2) <--
  
  --> message 3
  
  ACK(2) <--
  
  --> message 4
  
  ACK(3) <--
  
  In this situation, we would accept the first ACK(1.1) as the acknowledgment for message 1, which by then has been resent as 1.2. Then, when we send message 2, we receive the second ACK(1.2), which would be discarded since its MCID is not the one we’re waiting for (just like the Star Wars droids). We would then receive ACK(2) and accept it as the ACK for message 2. At that point, we’re back in sync.
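  
  The walk-through above can be replayed in a few lines of plain Tcl. This is only a simulation of the matching rule, not Cloverleaf code: `replay_acks` is a hypothetical name, and the two lists stand in for the outbound queue and the stream of control IDs arriving in ACKs.
  
  ```tcl
  # Hypothetical simulation: walk a stream of ACK control IDs and decide,
  # for each one, whether it matches the message currently awaiting an ACK.
  proc replay_acks {sentIds ackIds} {
      set log {}
      set idx 0                 ;# index of the message we are waiting on
      foreach ack $ackIds {
          set waiting [lindex $sentIds $idx]
          if {$idx < [llength $sentIds] && $ack eq $waiting} {
              lappend log "accept $ack"
              incr idx          ;# move on to the next outbound message
          } else {
              lappend log "discard $ack"
          }
      }
      return $log
  }
  
  # Message 1 is ACKed twice because it was resent after a timeout.
  puts [join [replay_acks {1 2 3} {1 1 2 3}] "\n"]
  ```
  
  With those inputs the log reads accept 1, discard 1, accept 2, accept 3: the duplicate ACK is dropped and every later ACK lands on its own message.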
  
  This is where my expertise evaporates…
  
  How would we modify cl_check_ack to discard an ACK with a non-matching MCID and continue to wait for an ACK with a matching MCID?
  
  In this excerpt from cl_check_ack, the “good” ACK case kills both the received ACK ($mh) and the sent message ($my_mh).
  
  Code:
  
      AA - CA {
          # Good ACK - Clean up
          set send_cnt 0 ;# Init counter
          return "{KILLREPLY $mh} {KILL $my_mh}"
      }
  
  If we kill just the reply – something like this (assuming we build code to set matching_mcid properly):
  
  Code:
  
      AA - CA {
          # Good ACK - Clean up
          if { ! $matching_mcid } {
              return "{KILLREPLY $mh}"
          } else {
              set send_cnt 0 ;# Init counter
              return "{KILLREPLY $mh} {KILL $my_mh}"
          }
      }
  
  Will that have the desired effect of discarding the superfluous ACK and moving on to the next one – or will it just leave $my_mh adrift somewhere in the ether?
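  
  As for setting `matching_mcid`, one possible sketch is to compare MSH-10 of the sent message with MSA-2 of the ACK. The helpers `hl7_field` and `mcid_matches` below are hypothetical, not part of cl_check_ack, and they assume carriage-return segment terminators and `|` as the field separator. Inside the proc, the two message strings would presumably come from something like `msgget $my_mh` and `msgget $mh`; that wiring is an assumption and is not shown here.
  
  ```tcl
  # Hypothetical helper: return field $fieldNum of the first segment named
  # $segName in an HL7 v2 message, or "" if the segment is absent.
  # Assumes \r segment terminators and "|" as the field separator.
  proc hl7_field {msg segName fieldNum} {
      foreach seg [split $msg \r] {
          set fields [split $seg |]
          if {[lindex $fields 0] eq $segName} {
              # MSH is special: MSH-1 is the separator itself, so MSH-n
              # sits at list index n-1 rather than n.
              if {$segName eq "MSH"} { incr fieldNum -1 }
              return [lindex $fields $fieldNum]
          }
      }
      return ""
  }
  
  # Hypothetical helper: 1 if the ACK's MSA-2 equals the sent message's
  # MSH-10 (and that ID is not empty), otherwise 0.
  proc mcid_matches {sentMsg ackMsg} {
      set sent  [hl7_field $sentMsg MSH 10]
      set acked [hl7_field $ackMsg MSA 2]
      return [expr {$sent ne "" && $sent eq $acked}]
  }
  ```
  
  With these in place, `set matching_mcid [mcid_matches $sentText $ackText]` would feed the `if` in the excerpt above, where `sentText` and `ackText` are placeholder names for the two message strings.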
  
  Jeff Dinsmore
  Chesapeake Regional Healthcare
- May 29, 2014 at 6:42 pm #80655
  Russ Ross
  Participant
  What you are describing is a significant reason I prefer batch charges over real-time charges.
  
  Now that we are going to EPIC and will be required to do real-time charges, I will have to be watchful to warn about how real-time resends can cause duplicate charges in a real-time integration.
  
  This in my opinion makes it necessary to control and check message sequencing for real-time charge integrations.
  
  We seemed to have gotten away with not worrying about this for our other real-time integrations.
  
  I did come up with a TPS proc to trigger an alert notification on resend count, but it is mostly used because it is more proactive than queue depth alerts and never generates a false alert, and not for out of sequence mismatch events.
  
  Having said that, I’m starting to realize a similar type of TPS proc could perhaps be put in place to detect out-of-sequence events, but I’m not sure what action would correct the problem, given your description of always being one off perpetually after a resend.
  
  Russ Ross
  RussRoss318@gmail.com
- May 30, 2014 at 5:29 pm #80656
  Jason Alexander
  Participant
  Jeff, contact me at jalex@u.washington.edu and I will send you a copy of the modified cl_check_ack that we use. We worked with Charlie 15 years ago to add this functionality back in 3.3 and have been carrying it forward since then, including the upgrade to the new ack handling routines.
  
  For charges we still strongly recommend using extremely long or even infinite timeouts as knowing when you have sent two copies of a message due to timeout doesn’t really save you anything once Epic has processed both copies of the message anyway. As a result I would not call even our code a solution for charges.
- May 30, 2014 at 5:41 pm #80657
  Jeff Dinsmore
  Participant
  Thanks for offering to share, Jason.
  
  With charges, I never resend. If there’s a timeout, I’ll generally notify via email that a potential exists for a missing charge.
  
  Usually, however, the message is actually received by the HIS system, it’s just slow to ACK.
  
  For some of my interfaces, I’ll keep a database of message IDs that have been sent to a particular destination to prevent any potential resends.
  
  For example, we morph reports from one of our cardiology systems into charges. If we receive another copy of the report due to an update or amendment, we will check the DB, see that it’s already been charged, and will not send the charge again.
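  
  A minimal sketch of that lookup, with an in-memory Tcl array standing in for the database. The names `sent_log` and `should_send` are hypothetical, and a real interface would need a persistent store so the record survives a restart.
  
  ```tcl
  # Hypothetical sketch: an in-memory array stands in for the database
  # table. Keys combine the destination and the message ID.
  array set sent_log {}
  
  # Return 1 and record the ID if it has not been sent to $dest before;
  # return 0 if it has, so the caller can suppress the duplicate.
  proc should_send {dest msgId} {
      global sent_log
      set key "$dest,$msgId"
      if {[info exists sent_log($key)]} {
          return 0
      }
      set sent_log($key) 1
      return 1
  }
  ```
  
  Because the key includes the destination, the same report ID can still be sent once to each of several destinations.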
  
  Jeff Dinsmore
  Chesapeake Regional Healthcare

The forum ‘Cloverleaf’ is closed to new topics and replies.