Successful messages hung in Recovery DB

Clovertech Forums › Read Only Archives › Cloverleaf › Successful messages hung in Recovery DB

  • Creator
    Topic
  • #51720
    Gena Gill
    Participant

  We have an issue where some messages are staying in the recovery database even after they have been successfully transmitted.  In most cases, we are sending the messages through a VPN tunnel to a 3rd party.  This happens on all of our messages going through the VPN tunnels, but rarely on other threads.

      The message state is that the message was delivered OK, which is great, but I don’t need it clogging up the recovery DB.  Where would I kill this message?

   msgType           : DATA
   msgClass          : PROTOCOL
   msgState          : OB delivered OK (14)
   msgPriority       : 5120
   msgRecoveryDbState: 3
   msgFlags          : 0x8002
   msgMid            : [0.0.119645283]
   msgSrcMid         : [0.0.119645216]
   msgSrcMidGroup    : midNULL
   msgOrigSrcThread  : p2_adt_s
   msgOrigDestThread : to_quantros
   msgSrcThread      : p2_adt_s
   msgDestThread     : to_quantros
   msgXlateThread    :
   msgSkipXlate      : 0
   msgSepChars       :
   msgNumRetries     : 0
   msgGroupId        : 0
   msgDriverControl  :
   msgRecordFormat   :
   msgRoutes         :
   msgUserData       :
   msgStaticIsDirty  : 0
   msgVariableIsDirty: 0
   msgTimeStartIb    : 1272037745.871
   msgTimeStartOb    : 1272037745.903
   msgTimeCurQueStart: 0.000
   msgTimeTotalQue   : 0.056
   msgTimeRecovery   : 1272037745.931
   msgEoConfig       : 0x0
   msgData (BO)      : 0x30000120
   message

    Viewing 6 reply threads
    • Author
      Replies
      • #71437
        James Cobane
        Participant

          Gena,

          You need to take a look at the configuration for the thread to see if there are any procs employed that are keeping those messages around in the recovery database.  Also check on how/if you are handling replies.  I suspect you may have some of the recovery procs employed, but maybe missing one of the procs (i.e. kill_ob_save) or something similar.  Without any procs employed, the engine should be cleaning up after itself.

          Jim Cobane

          Henry Ford Health

        • #71438
          Russ Ross
          Participant

Many years ago, on an old legacy interface, I had clogged state 14 messages that I had to clear manually.  First, list the state 14 messages:

hcidbdump -r -s 14

then get the message ID and delete it with:

hcidbdump -r -m messageID -D
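If there are many stuck messages, the two steps above can be chained.  This is only a sketch, not Cloverleaf tooling: it assumes the dump format matches the `msgMid : [0.0.nnnnnnnnn]` lines shown earlier in this thread, and it echoes the delete commands rather than running them, so the list can be reviewed first.

```shell
# Sample of an `hcidbdump -r -s 14` dump (format assumed from this thread).
dump='   msgMid            : [0.0.119645283]
   msgState          : OB delivered OK (14)
   msgMid            : [0.0.119645284]'

# Split each msgMid line on the square brackets and emit one delete
# command per message ID.  Pipe the output to `sh` only after reviewing it.
printf '%s\n' "$dump" |
  awk -F'[][]' '/msgMid /{ print "hcidbdump -r -m " $2 " -D" }'
```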

I uncovered that this happened when this particular interface sent a message that was too large for the foreign system’s input buffer allocated to the listener.

            The listener was written in FORTRAN and used COMMON statements to overlap memory but failed when those memory boundaries were exceeded and stepped all over the other memory areas.

This occurred often enough that the receiving system owner even wrote a mock interface that simply deleted the one message.  He would turn it on when the Cloverleaf alert for queue depth got triggered, let it delete that one message, and then turn the problematic listener back on.

This interface has since been replaced by a new system and I no longer have this problem, but I wanted you to know there are forces such as this that can make it appear Cloverleaf has a problem when it doesn’t.

You might want to dump the state 14 messages to a file and see if they are larger than normal when you see this behavior, because ours certainly were once upon a time.

I don’t imagine this is all that relevant, but this interface wasn’t typical TCP/IP MLP; it was a straight TCP/IP binary length-encoded protocol, using 4 bytes I think, but it might have been 8 bytes, I’m not sure.

            Russ Ross
            RussRoss318@gmail.com
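A minimal sketch of the kind of length-encoded framing described above, assuming a 4-byte big-endian length prefix (the width and byte order are assumptions; the original post was unsure of both):

```shell
# Build a length-prefixed frame: 4-byte big-endian length, then the payload.
msg='HELLO'                      # sample payload (hypothetical)
len=${#msg}

# Pack the length as four octal escape sequences, one per byte.
frame_hdr=$(printf '\\%03o\\%03o\\%03o\\%03o' \
  $(( len >> 24 & 255 )) $(( len >> 16 & 255 )) \
  $(( len >> 8  & 255 )) $(( len & 255 )))

# Show the framed bytes in hex: 00 00 00 05 followed by the payload bytes.
printf "${frame_hdr}%s" "$msg" | od -An -tx1
```

A receiver with a fixed input buffer must read the 4-byte header first and reject lengths larger than the buffer, which is exactly the check the FORTRAN listener in the story was missing.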

          • #71439
            Gena Gill
            Participant

              I may have the issue resolved now.  Jim’s reply helped me, specifically, “see if there are any procs employed that are keeping those messages around in the recovery database”.

Normally, we use “SendOK_save” in the Send OK Procs field on the Outbound tab, and it works just fine.  But on the interfaces sending through a VPN tunnel, where messages sit in the Recovery DB longer before being sent, especially when we get a backlog, they aren’t clearing.  So, I removed the SendOK_save, and it seems to be working OK.

At least, I don’t have messages sticking around after they’ve been sent, the 3rd party is receiving their messages, and if there’s a problem, I can re-send them from the save file.  I’m going to leave it off just this one interface, and if everything goes OK, I’ll consider removing it from the other VPN-tunneled interfaces.

            • #71440
              Scott Folley
              Participant

                You would only use SendOK_save if you are awaiting an acknowledgement.  If you are, then you should have inbound-replies processing set up on the outbound tab.

This also depends heavily on which version of Cloverleaf you have, because reply processing is “built-in” as of 5.6.  This means that checking Await Replies will auto-magically call the equivalent of SendOK_save.  In that case you should have check_ack or its equivalent in the TPS Inbound Reply stack, because that will kill the message when a valid reply is received.

                Hope that helps.

              • #71441
                Gena Gill
                Participant

That was definitely the magic combination.  The vendor did not require an ACK, and with this going over the VPN tunnel it didn’t make sense to await the reply.  I had unchecked Await Replies, got tons of errors, and then realized I should also remove the SendOK_save.

                  I’ve monitored this for a week now, and they are getting their messages just fine, and I’ve only had the odd single message here or there that had a problem, so I’m going to do this for some of our other interfaces that go through these tunnels.

                • #71442
                  Charlie Bursell
                  Participant

Without some sort of ACK you will surely lose messages.  This is what we affectionately call a “Send and Pray” protocol.   😀

The primary purpose of an ACK is not the warm fuzzy feeling that the message was delivered, but rather flow control: it ensures the sender cannot send messages faster than the receiver can accept them.

                    If you send a lot of messages in a short period of time I believe you will lose some.

                    Just my $0.02 worth

                  • #71443
                    Scott Folley
                    Participant

Though Charlie certainly doesn’t need me to back him up, I strongly echo his sentiment.  The fact that you are going across a VPN will actually increase the chances of losing messages, because it is possible for the connection to remain open when there is nothing on the other end to receive the message.  What will end up happening is that you will show you sent everything, because it will be in your outbound SMAT file, yet the receiving system will not have received it.  They will be all over you for not sending the message, and you will not have an acknowledgement from them to back you up when you tell them it was sent.  The folks on this forum know that a message appearing in your outbound SMAT file WAS SENT, but you will not likely be dealing with someone on this forum.

                  • The forum ‘Cloverleaf’ is closed to new topics and replies.