Cloverleaf Process Unresponsive – read returned error 0

  • Creator
    Topic
  • #55234
    Shane Farney
    Participant

      Hi everyone,

      I have a very simple inbound MDM thread that receives fairly high volume, and for whatever reason the process occasionally goes completely unresponsive.  The source of these MDMs is an Epic EPS server, so when Cloverleaf goes down, all of the messages that couldn’t be delivered fail and have to be resent via Epic Bridges, causing all kinds of extra work for analysts.  This MDM thread doesn’t really do anything: it simply takes the messages in, filters a few based on the standard code we use everywhere, and then moves them out.  Nothing obviously weird or unique is going on.  I have heard that cycling the process helps and might work as a band-aid fix if put on a schedule, but I’d like to get to the root of the problem first.

      One thing I have noticed is this very consistent error when viewing output via the Network Monitor:

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:30] read returned error 0 (Error 0)

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:31] read returned error 0 (Error 0)

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:32] read returned error 0 (Error 0)

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:33] read returned error 0 (Error 0)

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:34] read returned error 0 (Error 0)

      [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:35] read returned error 0 (Error 0)

      Any ideas if this constant erroring is a clue, or is this nothing to be concerned about?  Any advice at all is much appreciated.

      Thanks,

      Shane
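
To judge whether the repeated PDL error lines are a clue, one option is to count them per minute and compare the counts against the times the process hung. The following is a minimal Python sketch, not part of Cloverleaf. The function name `count_read_errors_per_minute` and the regular expression are assumptions based only on the six log lines quoted above, with the date read as MM/DD/YYYY.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, taken from the quoted sample only:
# [pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:30] read returned error 0 (Error 0)
LINE_RE = re.compile(
    r"\[pdl\s*:PDL\s*:ERR\s*/\d+:\s*(?P<thread>[^:\]]+):"
    r"(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\]\s*"
    r"read returned error 0"
)


def count_read_errors_per_minute(lines):
    """Count 'read returned error 0' lines per (thread field, minute)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a PDL read-error line; ignore it
        stamp = datetime.strptime(match.group("ts"), "%m/%d/%Y %H:%M:%S")
        key = (match.group("thread").strip(), stamp.strftime("%Y-%m-%d %H:%M"))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:30] read returned error 0 (Error 0)",
        "[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:31] read returned error 0 (Error 0)",
    ]
    for (thread, minute), n in sorted(count_read_errors_per_minute(sample).items()):
        print(thread, minute, n)
```

If the per-minute count simply tracks message volume, the lines probably reflect the sender closing each connection rather than a fault. A sudden stop in these lines while messages are still expected would be a more useful marker for when the process hung.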

    Viewing 8 reply threads
    • Author
      Replies
      • #84654
        Jim Kosloskey
        Participant

          Is the source system disconnecting and reconnecting from time to time?

          What release of Cloverleaf?

          email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

        • #84655
          Shane Farney
          Participant

            6.1.1 on AIX

            The connection is opening/closing for what I assume to be every message.

          • #84656
            Michael Hertel
            Participant

              Check to make sure only one Epic interface is trying to connect to this one.

              It sounds like two different interfaces may be trying to connect to this interface at the same time.

              The next time this happens run netstat -a and grep for the port number.

              See which system connects. Watch it disconnect and repeat the netstat to see if a different host has connected.

              I’ve seen this where test and prod are competing for the same connection because someone forgot to change the port number back.
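
The netstat check above can be repeated by hand, or the output can be fed to a small script that lists which remote hosts hold an established connection on the port. This is a rough Python sketch under several assumptions: the output comes from `netstat -an` (numeric addresses), the columns follow the usual protocol, queues, local address, foreign address, state layout, and the function names, sample addresses, and port `11812` are made up for illustration. It accepts both the `addr.port` form AIX prints and the `addr:port` form seen on Linux.

```python
def split_endpoint(endpoint):
    """Split 'addr.port' (AIX style) or 'addr:port' (Linux style)."""
    idx = max(endpoint.rfind("."), endpoint.rfind(":"))
    if idx == -1:
        return endpoint, ""
    return endpoint[:idx], endpoint[idx + 1:]


def remote_hosts_on_port(netstat_output, port):
    """Return the set of remote addresses with an ESTABLISHED TCP
    connection to the given local port."""
    hosts = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        _, local_port = split_endpoint(local)
        if local_port == str(port):
            hosts.add(split_endpoint(remote)[0])
    return hosts


if __name__ == "__main__":
    # Hypothetical sample output, not captured from a real system.
    sample = """\
tcp4       0      0  *.11812            *.*                LISTEN
tcp4       0      0  10.0.0.5.11812     10.0.1.20.51234    ESTABLISHED
tcp4       0      0  10.0.0.5.11812     10.0.1.21.40022    ESTABLISHED
"""
    print(sorted(remote_hosts_on_port(sample, 11812)))
```

More than one remote address is not by itself a problem if the sender is clustered. The thing to look for is an address that belongs to an unexpected environment, such as a test system pointed at the production port.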

            • #84657
              Jim Kosloskey
              Participant

                Those PDL errors you see are the source side closing the connection.

                Can’t Bridges keep a persistent connection? That would at least remove those errors.

                Also check what Michael suggested to make sure only one system is connecting.

                I am not sure if cleaning the above up will resolve your ‘hung’ process but it should improve a lot of things (like how can you possibly have a reliable Alert for connection?).

                email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

              • #84658
                aaron kaufman-moore
                Participant

                  Shane,

                  Your experience mirrors ours for our MDM interface from Epic’s EPS outbound documentation. We had the process go down every ~4-6 weeks when running on 6.0.2 but haven’t been up on 6.1.2 long enough to see if the pattern will repeat itself.

                  It isn’t a persistent connection since it comes from Interconnect/EPS and not Bridges making the connection, so what you see is normal.

                  We have our inbound set up as multi-server since we can have multiple sources (the source is clustered and multiple hosts can connect to send messages).

                  We use the error settings in Bridges to notify us immediately when we receive an error 56, so the on-call person usually gets into Cloverleaf quickly and restarts the process before too much needs to be retriggered from Epic.

                • #84659
                  Shane Farney
                  Participant

                    That’s good to know this is normal.  I did dig up this from the 6.1.2 release notes which seems to imply that there was a bug with Cloverleaf handling this kind of a connection:

                    Issue

                    Memory leak with MLP PDL (12592)

                    Description

                    A memory leak with MLP PDL could happen with sites that have a large number of messages, for example, a query site that has multiple applications connecting/disconnecting to it.

                    Instead of connecting, sending the query, waiting for the reply, and then disconnecting, the application with the memory leak connects, sends the query, immediately disconnects, and waits for a reply on a separate interface.

                    When Cloverleaf does not initiate the close connection or send a reply first, it does not clean up the memory until the process is stopped.

                    This issue no longer happens.

                  • #84660
                    John Mercogliano
                    Participant

                      We had the same issue on 5.7. The Epic Bridges analysts would run a report every day and get thousands of messages that had failed and had to be resent.  We implemented an alert to restart the process.  Since going to 6.1.2, their report has been mostly empty, with an occasional report showing just a few.

                      So, 6.1.2 seems to have really cleaned things up on that front.

                      John Mercogliano
                      Sentara Healthcare
                      Hampton Roads, VA

                    • #84661
                      Shane Farney
                      Participant

                        John Mercogliano wrote:

                        We had the same issue on 5.7. The Epic Bridges analysts would run a report every day and get thousands of messages that had failed and had to be resent.

                      • #84662
                        Tom Rioux
                        Participant

                          Shane,

                          We have the same issue here. Every time it reconnects to Interconnect, it throws a “read returned error 0” message into our log file. I’m assuming this is due to the PDL.

                          Did you modify your PDL or use a different one? I was wondering if there is a way to prevent the error from being logged.

                          Thanks…

                          Tom Rioux
