Ack issues

  • #51182
    Vaughn Skinner
    Participant

      I have a situation where an LIS system sent a batch of about 12000 messages to Cloverleaf.  For a few messages near the end, Cloverleaf sent two acks: one that reported an error and another that appears to be a normal OK.  The sending system notes the error and sends the message again.  Cloverleaf accepts the resent message and passes it on, resulting in a lot of duplicates.

      Any ideas about how this situation could arise and how to prevent it?

      Here is the tcpdump output:

      MSH|^~&|Cloverleaf||||200909112127||ACK||P|2.2|^MMSA|AR||Invalid MSH segment^M.

      MSH|^~&|||LABDAQ||200909112127||ACK|20090911165007897|P|2.4|^MMSA|AA|20090911165007897|^M.
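A minimal sketch, in Python rather than anything Cloverleaf-specific, of pulling the MSA fields out of the two ACKs above so they can be compared. The function name `parse_ack` is made up for illustration, and the ACK strings are retyped from the tcpdump output with `^M` written as `\r` and the HL7 escape character written out as it appears in the hex dump later in the thread.

```python
def parse_ack(raw: str) -> dict:
    """Return the MSA code, control ID and text from an HL7 ACK.

    Assumes segments are separated by carriage returns and fields by '|'.
    """
    segments = [s for s in raw.split("\r") if s]
    msa = next(s for s in segments if s.startswith("MSA"))
    fields = msa.split("|")
    # Pad so that missing trailing fields read as empty strings.
    fields += [""] * (4 - len(fields))
    return {"code": fields[1], "control_id": fields[2], "text": fields[3]}


error_ack = "MSH|^~\\&|Cloverleaf||||200909112127||ACK||P|2.2|\rMSA|AR||Invalid MSH segment\r"
ok_ack = ("MSH|^~\\&|||LABDAQ||200909112127||ACK|20090911165007897|P|2.4|\r"
          "MSA|AA|20090911165007897|\r")

print(parse_ack(error_ack))  # code 'AR', empty control ID
print(parse_ack(ok_ack))     # code 'AA', control ID 20090911165007897
```

The AR carries no message control ID while the AA echoes one, which suggests the two acks answer two different inbound transmissions rather than the same message twice.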

      I’ve attached the relevant protocol section from the connection thread.

         { PROTOCOL {

             { CA_FILE {} }

             { CA_PATH {} }

             { CERT_FILE {} }

             { CLOSE 0 }

             { CONTROLMSGS 0 }

             { COPYCLIENTIPP 0 }

             { HOST localhost }

             { ISMULTI 0 }

             { ISSERVER 1 }

             { IS_SSL 0 }

             { LOCAL_IP {} }

             { MAXCLIENT 0 }

             { MAXOBQD 0 }

             { MAXPREXLTQD 0 }

             { MODE {} }

             { PASSWORD {} }

             { PDLNAME mlp_tcp.pdl }

             { PDLTYPE tcp-server }

             { PORT 7021 }

             { PRIVATE_KEY {} }

             { RECONNECT 1 }

             { REOPEN 5 }

             { SSL_PROTOCOL All }

             { TYPE pdl-tcpip }

         } }

      • #69081
        Jim Kosloskey
        Participant

          Vaughn,

          What reply generation proc are you using?

          Have you analyzed the inbound message(s) that were being responded to (they should be in the inbound SMAT) and matched them up with the acknowledgments sent?

          If you are seeing one message received and then two acknowledgments before the next message is sent, take a look at your reply proc.

          email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

        • #69082
          Russ Ross
          Participant

            The time I saw something similar to this was when the time out for a resend was shorter than the amount of time it took cloverleaf to send back the acks.

            When many messages are backlogged for sending, this becomes more obvious, which may be what is happening in your case.

            In my case I noticed this when I cycled the thread: it took about 10 seconds for the thread to start up, and the resend time on the foreign system's sending interface was set to one second, so the same message was resent 10 times on thread startup.

            Resetting the resend timeout on the foreign sending interface to something reasonable solved the problem.

            I think most of our resend time outs are set around 30 – 60 seconds these days.

            Check how long the foreign lab interface waits before it resends a message.
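The arithmetic behind the resend-timeout case can be sketched as follows. This is a simplified model, not a description of any particular vendor's sender: it assumes the sender retransmits every `resend_timeout_s` seconds until the first ack arrives, and that a timeout expiring at the same instant as the ack still counts as a resend. The function name is hypothetical.

```python
import math


def resends_before_ack(ack_delay_s: float, resend_timeout_s: float) -> int:
    """Count retransmissions sent before an ack that arrives ack_delay_s
    seconds after the first send, given a fixed resend timeout."""
    if resend_timeout_s <= 0:
        raise ValueError("resend timeout must be positive")
    return math.floor(ack_delay_s / resend_timeout_s)


# 10 s thread startup with a 1 s resend timeout, as in the case described above.
print(resends_before_ack(10, 1))   # 10
# The same delay with a 30 s resend timeout produces no resends.
print(resends_before_ack(10, 30))  # 0
```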

            Russ Ross
            RussRoss318@gmail.com

          • #69083
            Russ Ross
            Participant

              After reading your post more closely a second time this morning, the thought about the AR error (invalid MSH segment) made me realize this isn’t likely to be a resend time out issue.

              Try looking in SMAT to see if you got incomplete messages that don’t start with MSH.

              I will not be surprised if you see some messages that are incomplete or split into parts.

              Seems like I even recall some clovertech posts about messages being split into parts and how to resolve if that turns out to be your issue.

              Sometimes I’ve even seen vendors send just \r\n.

              Looking at SMAT hopefully will help you define the problem more clearly rather than just dealing with symptoms of the problem.

              Russ Ross
              RussRoss318@gmail.com

            • #69084
              Vaughn Skinner
              Participant

                The only thing I see unusual in the SMAT is that the len10 numbers are too long.  It looks like there was a zero length message and then a regular message.  This is the case for 15 messages (repeated many times) out of the 12000 batch.

                00000000000000002915MSH|^~&|LABDAQ||||200909111650||ORU^R01|20090911165007897|P|2.4|||||||||
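A rough sketch of why the prefix looks too long, assuming the len10 layout is simply a 10-digit ASCII length followed by that many bytes of message (an assumption inferred from the record above, not taken from Cloverleaf documentation). The helper name `split_len10` and the short sample body are made up for illustration; the real message is 2915 bytes.

```python
def split_len10(data: bytes) -> list[bytes]:
    """Split a len10-encoded byte string into its message bodies."""
    records = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 10]
        if len(header) < 10 or not header.isdigit():
            raise ValueError(f"bad len10 header at offset {pos}: {header!r}")
        length = int(header)
        pos += 10
        body = data[pos:pos + length]
        if len(body) < length:
            raise ValueError(f"record at offset {pos} is truncated")
        records.append(body)
        pos += length
    return records


# A zero-length record immediately followed by a normal one: the two
# headers run together and look like a single 20-digit length.
sample = b"0000000000" + b"0000000005" + b"MSH|^"
print(split_len10(sample))  # [b'', b'MSH|^']
```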

                The labdaq service was restarted many times during this batch (because it is single threaded and otherwise wouldn’t receive orders).  We are wondering if this might be the source of these 0 length messages.

                I have tcpdump for the entire batch.  Perhaps the solution is to change RawHl7Ack to kill zero length messages?

                Please note the 0b1c0d in this first transmission (at 0x28).   Is this a 0 length message?  I see this in the vicinity of each of the ‘Invalid MSH segment’ responses.

                16:15:33.924169 IP labdaq -> cloverleaf P 1197:1200(3) ack 92 win 256

                 0x0000:  4500 002b 3536 4000 8006 082e c0a8 1e0e  E..+56@………

                 0x0010:  c0a8 1e0a f5ef 1b6d ff61 4b9c 6fc0 4fdc  …….m.aK.o.O.

                 0x0020:  5018 0100 bd4c 0000 0b1c 0d00 0000       P….L……..

                16:15:33.928706 IP labdaq.62959 > cloverleaf.dpserveadmin: F 1200:1200(0) ack 92 win 256

                 0x0000:  4500 0028 3537 4000 8006 0830 c0a8 1e0e  E..(57@….0….

                 0x0010:  c0a8 1e0a f5ef 1b6d ff61 4b9f 6fc0 4fdc  …….m.aK.o.O.

                 0x0020:  5011 0100 d56f 0000 0000 0000 0000       P….o……..

                16:15:33.940898 IP cloverleaf.dpserveadmin > labdaq64.62959: P 92:172(80) ack 1201 win 65

                 0x0000:  4500 0078 3d93 4000 4006 3f84 c0a8 1e0a  E..x=.@.@.?…..

                 0x0010:  c0a8 1e0e 1b6d f5ef 6fc0 4fdc ff61 4ba0  …..m..o.O..aK.

                 0x0020:  5018 0041 bdd3 0000 0b4d 5348 7c5e 7e5c  P..A…..MSH|^~

                 0x0030:  267c 436c 6f76 6572 6c65 6166 7c7c 7c7c  &|Cloverleaf||||

                 0x0040:  3230 3039 3039 3131 3136 3135 7c7c 4143  200909111615||AC

                 0x0050:  4b7c 7c50 7c32 2e32 7c0d 4d53 417c 4152  K||P|2.2|.MSA|AR

                 0x0060:  7c7c 496e 7661 6c69 6420 4d53 4820 7365  ||Invalid.MSH.se

                 0x0070:  676d 656e 740d 1c0d                      gment…
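One way to check what the first packet actually carried is to decode it. The sketch below is Python, with hypothetical helper names, and assumes the hex dump starts at the IPv4 header (the leading 0x45 is consistent with that) and that MLLP framing is 0x0b, body, 0x1c 0x0d. The hex string is copied from the first packet above.

```python
END_BLOCK = b"\x1c\x0d"


def tcp_payload(packet: bytes) -> bytes:
    """Return the TCP payload of an IPv4 packet, ignoring link-layer padding."""
    ip_header_len = (packet[0] & 0x0F) * 4
    total_len = int.from_bytes(packet[2:4], "big")
    tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4
    return packet[ip_header_len + tcp_header_len:total_len]


def mllp_messages(stream: bytes) -> list[bytes]:
    """Extract the bodies of all complete MLLP frames in a byte stream."""
    bodies = []
    pos = 0
    while True:
        start = stream.find(b"\x0b", pos)
        if start == -1:
            break
        end = stream.find(END_BLOCK, start + 1)
        if end == -1:
            break
        bodies.append(stream[start + 1:end])
        pos = end + len(END_BLOCK)
    return bodies


packet = bytes.fromhex(
    "4500 002b 3536 4000 8006 082e c0a8 1e0e"
    "c0a8 1e0a f5ef 1b6d ff61 4b9c 6fc0 4fdc"
    "5018 0100 bd4c 0000 0b1c 0d00 0000"
)
payload = tcp_payload(packet)
print(payload.hex())           # 0b1c0d
print(mllp_messages(payload))  # [b'']
```

The IP total length is 0x2b (43 bytes), which leaves 3 bytes of payload after the two 20-byte headers, matching the `1197:1200(3)` in the tcpdump summary line. Under these assumptions the trailing zero bytes are padding, and the payload is one complete MLLP frame with an empty body.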

              • #69085
                Vaughn Skinner
                Participant

                  I sent a message to the server with the same packet load as in the tcpdump message above and generated the error message.

                  perl -e "printf('%c%c%c%c%c%c',0xb,0x1c,0xd,0,0,0);" | nc localhost 7020 > zz

                  cat zz

                  ^KMSH|^~&|Cloverleaf||||200909150621||ACK||P|2.2|^MMSA|AR||Invalid MSH segment^M^^M

                  Any recommendations about how best to squelch this in cloverleaf?  I can modify the RawHL7Ack easily enough to drop the ack but are there other ramifications?
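As a sketch of the decision being asked about, the logic could look like the following. This is Python for illustration only: it is not the RawHL7Ack proc, the function name and the return labels are hypothetical, and a real change would go into the Tcl reply proc. It treats a body that is empty or contains only nulls and whitespace as something to drop silently, and anything else that does not start with MSH as something to reject.

```python
def classify_inbound(body: bytes) -> str:
    """Decide how to answer an inbound message body (MLLP framing removed).

    Returns 'drop' (send no ack), 'reject' (send an AR) or 'accept'.
    """
    stripped = body.strip(b"\x00\r\n\t ")
    if not stripped:
        return "drop"
    if not stripped.startswith(b"MSH"):
        return "reject"
    return "accept"


print(classify_inbound(b""))                   # drop
print(classify_inbound(b"PID|1"))              # reject
print(classify_inbound(b"MSH|^~\\&|LABDAQ|"))  # accept
```

One consequence of dropping rather than rejecting is that the sender gets no response at all to an empty frame, so whether that is safe depends on how the sending system reacts to a missing ack.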

                • #69086
                  Jim Kosloskey
                  Participant

                    Vaughn,

                    That first message appears to just be the mlp encapsulation characters. That most likely would fail your reply proc (as it should).

                    I would ask the sending system to fix their system. Sending an empty message (just the encapsulation set) is not correct.

                    email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.
