Successful messages hung in Recovery DB

Clovertech Forums › Read Only Archives › Cloverleaf › Successful messages hung in Recovery DB

  • Creator
    Topic
  • #51720
    Gena Gill
    Participant

  We have an issue where some messages are staying in the recovery database even after they have been successfully transmitted.  In most cases, we are sending the messages through a VPN tunnel to a 3rd party.  This happens on all of our messages going through the VPN tunnels, but rarely on other threads.

      The message state is that the message was delivered OK, which is great, but I don’t need it clogging up the recovery DB.  Where would I kill this message?

   msgType           : DATA
   msgClass          : PROTOCOL
   msgState          : OB delivered OK (14)
   msgPriority       : 5120
   msgRecoveryDbState: 3
   msgFlags          : 0x8002
   msgMid            : [0.0.119645283]
   msgSrcMid         : [0.0.119645216]
   msgSrcMidGroup    : midNULL
   msgOrigSrcThread  : p2_adt_s
   msgOrigDestThread : to_quantros
   msgSrcThread      : p2_adt_s
   msgDestThread     : to_quantros
   msgXlateThread    :
   msgSkipXlate      : 0
   msgSepChars       :
   msgNumRetries     : 0
   msgGroupId        : 0
   msgDriverControl  :
   msgRecordFormat   :
   msgRoutes         :
   msgUserData       :
   msgStaticIsDirty  : 0
   msgVariableIsDirty: 0
   msgTimeStartIb    : 1272037745.871
   msgTimeStartOb    : 1272037745.903
   msgTimeCurQueStart: 0.000
   msgTimeTotalQue   : 0.056
   msgTimeRecovery   : 1272037745.931
   msgEoConfig       : 0x0
   msgData (BO)      : 0x30000120
   message

    Viewing 6 reply threads
    • Author
      Replies
      • #71437
        James Cobane
        Participant

          Gena,

          You need to take a look at the configuration for the thread to see if there are any procs employed that are keeping those messages around in the recovery database.  Also check on how/if you are handling replies.  I suspect you may have some of the recovery procs employed, but maybe missing one of the procs (i.e. kill_ob_save) or something similar.  Without any procs employed, the engine should be cleaning up after itself.

          Jim Cobane

          Henry Ford Health

        • #71438
          Russ Ross
          Participant

Many years ago, on an old legacy interface, I had clogged state 14 messages that I had to clear manually.  First, list the state 14 messages:

hcidbdump -r -s 14

then get the message ID and delete it with:

hcidbdump -r -m messageID -D
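If there are many stuck messages, the two steps above can be chained.  This is only a sketch, not Cloverleaf tooling: it assumes the dump format matches the `msgMid : [0.0.nnnnnnnnn]` lines shown earlier in this thread, and it echoes the delete commands rather than running them, so the list can be reviewed first.

```shell
# Sample of an `hcidbdump -r -s 14` dump (format assumed from this thread).
dump='   msgMid            : [0.0.119645283]
   msgState          : OB delivered OK (14)
   msgMid            : [0.0.119645284]'

# Split each msgMid line on the square brackets and emit one delete
# command per message ID.  Pipe the output to `sh` only after reviewing it.
printf '%s\n' "$dump" |
  awk -F'[][]' '/msgMid /{ print "hcidbdump -r -m " $2 " -D" }'
```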

I uncovered that this happened when this particular interface sent a message that was too large for the foreign system’s input buffer allocated to the listener.

            The listener was written in FORTRAN and used COMMON statements to overlap memory but failed when those memory boundaries were exceeded and stepped all over the other memory areas.

This occurred often enough that the receiving system owner even wrote a mock interface that simply deleted the one message.  He would turn it on when the Cloverleaf alert for queue depth got triggered, let it delete that one message, and then turn the problematic listener back on.

This interface has since been replaced by a new system and I no longer have this problem, but I wanted you to know there are forces such as this that can make it appear Cloverleaf has a problem when it doesn’t.

You might want to dump the state 14 messages to a file and see if they are larger than normal when you see this behavior, because ours certainly were once upon a time.

I don’t imagine this is all that relevant, but this interface wasn’t typical TCP/IP MLP; it was a straight TCP/IP binary length-encoded protocol, using 4 bytes I think, but it might have been 8 bytes, I’m not sure.

            Russ Ross
            RussRoss318@gmail.com
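A minimal sketch of the kind of length-encoded framing described above, assuming a 4-byte big-endian length prefix (the width and byte order are assumptions; the original post was unsure of both):

```shell
# Build a length-prefixed frame: 4-byte big-endian length, then the payload.
msg='HELLO'                      # sample payload (hypothetical)
len=${#msg}

# Pack the length as four octal escape sequences, one per byte.
frame_hdr=$(printf '\\%03o\\%03o\\%03o\\%03o' \
  $(( len >> 24 & 255 )) $(( len >> 16 & 255 )) \
  $(( len >> 8  & 255 )) $(( len & 255 )))

# Show the framed bytes in hex: 00 00 00 05 followed by the payload bytes.
printf "${frame_hdr}%s" "$msg" | od -An -tx1
```

A receiver with a fixed input buffer must read the 4-byte header first and reject lengths larger than the buffer, which is exactly the check the FORTRAN listener in the story was missing.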

          • #71439
            Gena Gill
            Participant

              I may have the issue resolved now.  Jim’s reply helped me, specifically, “see if there are any procs employed that are keeping those messages around in the recovery database”.

Normally, we use “SendOK_save” in the Send OK Procs field on the Outbound tab, and it works just fine.  But on the interfaces sending through a VPN tunnel, where messages sit in the Recovery DB longer before being sent, especially when we get a backlog, they aren’t clearing.  So, I removed the SendOK_save, and it seems to be working OK.

At least, I don’t have messages sticking around after they’ve been sent, the 3rd party is receiving their messages, and if there’s a problem, I can re-send them from the save file.  I’m going to leave it off just this one interface, and if everything goes OK, I’ll consider removing it from the other VPN-tunneled interfaces.

            • #71440
              Scott Folley
              Participant

                You would only use SendOK_save if you are awaiting an acknowledgement.  If you are, then you should have inbound-replies processing set up on the outbound tab.

This also depends heavily on which version of Cloverleaf you have, because reply processing is “built-in” as of 5.6.  This means that checking Await Replies will auto-magically call the equivalent of SendOK_save.  In that case you should have check_ack or its equivalent in the TPS Inbound Reply stack, because that will kill the message when a valid reply is received.

                Hope that helps.

              • #71441
                Gena Gill
                Participant

That was definitely the magic combination.  The vendor did not require an ACK, and with this going over the VPN tunnel it didn’t make sense to await the reply.  I had unchecked Await Replies, got tons of errors, and then realized I should also remove the SendOK_save.

                  I’ve monitored this for a week now, and they are getting their messages just fine, and I’ve only had the odd single message here or there that had a problem, so I’m going to do this for some of our other interfaces that go through these tunnels.

                • #71442
                  Charlie Bursell
                  Participant

Without some sort of ACK you will surely lose messages.  This is what we affectionately call a “Send and Pray” protocol.   😀

The primary purpose of an ACK is not the warm fuzzy feeling that the message was delivered, but rather flow control: it ensures the sender cannot send messages faster than the receiver can accept them.

                    If you send a lot of messages in a short period of time I believe you will lose some.

                    Just my $0.02 worth

                  • #71443
                    Scott Folley
                    Participant

Though Charlie certainly doesn’t need me to back him up, I strongly echo his sentiment.  The fact that you are going across a VPN will actually increase the chances of losing messages, because it is possible for the connection to remain open when there is nothing on the other end to receive the message.  What will end up happening is that you will show you sent everything, because it will be in your outbound SMAT file, yet the receiving system will not have received it.  They will be all over you for not sending the message, and you will not have an acknowledgement from them to back you up when you tell them it was sent.  The folks on this forum know that a message appearing in your outbound SMAT file WAS SENT, but you will not likely be dealing with someone on this forum.

                  • The forum ‘Cloverleaf’ is closed to new topics and replies.