Thread backing up regularly, PDL signaled exception: code 1, msg write failure

  • #120150
    Joe Baranski
    Participant

      We are working with a vendor on a high-volume outbound interface that keeps going down. Their system says Cloverleaf is closing the connection. Looking at our logs, the times correlate with the following errors:

      [pdl :PDL :ERR /0:mt_vitals_oru_ob:01/12/2023 23:08:53] write timeout expired
      [pdl :PDL :ERR /0:mt_vitals_oru_ob:01/12/2023 23:08:53] PDL signaled exception: code 1, msg write failure

      Searching these forums, I found a smattering of 8+ year old threads on a similar trail, but none included a solution. Because of the high volume, we turned off waiting for and processing ACKs to see if that would help, but to no avail.

      Cloverleaf: 19.1.2.1P

      AIX: 7.1

      • #120151
        Jim Kosloskey
        Participant

          Just looking at those log entries I would hazard a guess that the issue is with the receiving system. PDL’s write to that port has timed out. I would ask the receiving system to give me more detail (perhaps from one of their logs). You could also deploy a sniffer, if your organization has one, for more specifics.

          My guess is the receiving system is late in replying at the IP level. If that is the case, they will need to investigate their side for more detail.

          These are just guesses – I could be way off base.
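To make the "write timeout" idea concrete, here is a minimal Python sketch of the general socket behavior being described. It is not Cloverleaf's PDL code, and the function names are hypothetical. It assumes only standard socket semantics: when the receiver stops reading, its buffer fills, the sender's write blocks, and a configured timeout eventually expires.

```python
import socket


def send_with_timeout(sock: socket.socket, payload: bytes, timeout_s: float) -> bool:
    """Return True if the whole payload was written, False if the write timed out."""
    sock.settimeout(timeout_s)
    try:
        sock.sendall(payload)
        return True
    except socket.timeout:
        # The peer did not drain its receive buffer in time.
        return False


def demo_write_timeout() -> bool:
    """Simulate a receiver that never reads, so the sender's write stalls."""
    sender, receiver = socket.socketpair()
    try:
        # 16 MiB is far larger than the kernel socket buffers, and the
        # receiver never calls recv(), so sendall() cannot complete.
        return send_with_timeout(sender, b"x" * (16 * 1024 * 1024), 0.2)
    finally:
        sender.close()
        receiver.close()
```

In this sketch the sender is the side that gives up, which would be consistent with a sender logging a write timeout while the receiver only sees the connection closed by the remote host.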

           

          email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

        • #120152
          Joe Baranski
          Participant

            Thanks for the quick reply!

            I’m checking with our network team on the sniffer. The receiving system did send basic log info; for the error above, the correlating entry is “01/12/23 2308 Connection Closed by Remote Host.” This has been an ongoing problem that can result in significant downtime.

          • #120153
            Jim Kosloskey
            Participant

              Are you going through a VPN?


            • #120154
              Joe Baranski
              Participant

                Nope, inhouse server to inhouse server.

              • #120156
                Paul Stein
                Participant

                  Have you tried using the native TCP/IP protocol driver and using the encapsulated mode of just ‘mllp’?

                  Also, since I see write timeouts, I am curious whether you need to increase your encapsulation timeout. This sometimes happens to me when sending large base64-encoded PDFs. If you have a large file to write and you hit your encapsulation timeout, the connection will be reset and the message will try to send again, over and over if you have reply handling on.

                • #120161
                  Jim Kosloskey
                  Participant

                    Joe, be sure to let us know what you determine. Paul’s suggestion about adjusting the wait time in the PDL (I think that can be set there) is a good one, as is his even better suggestion to move away from PDL to TCP/IP MLLP if you can. I don’t think PDL will be enhanced in a future release, and there might come a time when it is deprecated.


                  • #120165
                    Joe Baranski
                    Participant

                      This particular interface is actually rather small, just high volume vitals from a Pt Monitor.  pdl-tcpip is the standard we use on ALL interfaces that aren’t files, so if there’s a chance it goes away that’s gonna cause a lot of conversion work on our end.

                       

                      If we were to convert this particular interface to tcpip with MLLP, would the change be completely transparent to the receiving system or are there additional tweaks that would need to be made?

                      thanks,

                      -Joe

                    • #120167
                      Paul Stein
                      Participant

                        I had to make an on the fly decision to move from PDL to native TCP/IP -MLLP for a downstream issue in PROD, and had no issue. Obviously, I suggest testing this in a non-PROD environment if possible.

                        So, assuming the downstream vendor supports MLLP, they should have no issue with this change.

                        More info on what this driver uses via help documentation for your vendor:
                        The encapsulation is defined as a start block, the message, and an end block.

                        - The start block consists of a single byte with a value of 0x0B. This is the ASCII VT character code.
                        - The end block consists of two characters with the values 0x1C followed by 0x0D. These are the ASCII character codes for FS and CR.

                        Description for MLLP:
                        This is the default. The Start Block, End Block, Commit ACK, and Negative ACK options are disabled, so that the standard MLLP Start Block and End Block are used.
                        The default Timeout is 30 seconds and the corresponding Timeout Handling is RESET. Both are selected automatically if no user-specified settings are loaded.
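The framing described above (start block 0x0B, message, end block 0x1C 0x0D) can be sketched in a few lines of Python. This is only an illustration of the byte layout, not the Cloverleaf driver's implementation. The function names are hypothetical, and the sample HL7 segment in the usage note is made up.

```python
START_BLOCK = b"\x0b"    # ASCII VT
END_BLOCK = b"\x1c\x0d"  # ASCII FS followed by CR


def mllp_wrap(message: bytes) -> bytes:
    """Wrap one message in the MLLP start and end blocks."""
    return START_BLOCK + message + END_BLOCK


def mllp_unwrap(buffer: bytes):
    """Split a byte buffer into complete messages plus any unfinished remainder.

    Returns (messages, leftover). Bytes before the first start block are dropped.
    """
    messages = []
    while True:
        start = buffer.find(START_BLOCK)
        if start == -1:
            # No start block at all: nothing usable is left in the buffer.
            return messages, b""
        end = buffer.find(END_BLOCK, start + 1)
        if end == -1:
            # A frame has started but its end block has not arrived yet.
            return messages, buffer[start:]
        messages.append(buffer[start + 1:end])
        buffer = buffer[end + len(END_BLOCK):]
```

For example, `mllp_wrap(b"MSH|^~\\&|A|B")` yields the message preceded by one 0x0B byte and followed by 0x1C 0x0D. A receiver that expects standard MLLP should see the same bytes on the wire regardless of which driver produced them, which is why the switch would be expected to be transparent. That expectation is still worth confirming in a non-PROD environment.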

                      • #120198
                        Joe Baranski
                        Participant

                          I haven’t had the opportunity to convert this interface yet. However, we are currently in the midst of severe backups with it. Double-checking the log and errors for the process, we are seeing hundreds of thousands of instances of the following error:

                          [tps :tps :ERR /0:mt_vitals_oru_ob:01/25/2023 05:27:15] ‘KILL ‘ (returned by ‘validate_hl7ack ‘) does not match { <key> <value> }

                          I tried digging through the proc but couldn’t find anything specific to that error.

                        • #120199
                          Joe Baranski
                          Participant

                            Disregard the last update; we were missing a proc.
