Cloverleaf Process Unresponsive – read returned error 0

This topic has 9 replies, 6 voices, and was last updated 9 years, 3 months ago by Tom Rioux.

Creator

Topic
October 18, 2016 at 3:37 pm #55234
Shane Farney
Participant
Hi everyone,

I have a very simple inbound MDM thread that receives fairly high volume, and for whatever reason the process occasionally goes completely unresponsive. The source of these MDMs is an Epic EPS server, so when Cloverleaf goes down, all of the messages that couldn’t be delivered fail and have to be resent via Epic Bridges, causing all kinds of extra work for analysts. This MDM thread doesn’t really do anything: it simply takes the messages in, filters a few based on the standard code we use everywhere, and then moves them out. Nothing obviously weird or unique is going on. I have heard that cycling the process helps and might work as a band-aid fix if put on a schedule, but I’d like to get to the root of the problem first.

One thing I have noticed is this very consistent error when viewing output via the Network Monitor:

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:30] read returned error 0 (Error 0)

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:31] read returned error 0 (Error 0)

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:32] read returned error 0 (Error 0)

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:33] read returned error 0 (Error 0)

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:34] read returned error 0 (Error 0)

[pdl :PDL :ERR /0:EPIisMDM11812:10/18/2016 10:17:35] read returned error 0 (Error 0)

Is this constant stream of errors a clue, or is it nothing to be concerned about? Any advice at all is much appreciated.

Thanks,

Shane
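To gauge how constant the error really is, the log lines above can be tallied per thread and per minute. The following is only a minimal sketch: it assumes the log format is exactly what is shown in the post, that the field before the timestamp (`EPIisMDM11812` here) is the thread name, and that dates are month/day/year. The function name `count_eof_errors` is made up for this example.

```python
import re
import sys
from collections import Counter
from datetime import datetime

# Pattern modeled on the lines quoted in the post; other log layouts are not handled.
LINE_RE = re.compile(
    r"\[pdl\s*:PDL\s*:ERR\s*/\d+:\s*(?P<thread>[^:]+):"
    r"(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\]\s*read returned error 0"
)


def count_eof_errors(lines):
    """Count 'read returned error 0' lines per (thread, minute)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not one of the PDL read errors
        # Assumption: timestamps are MM/DD/YYYY HH:MM:SS.
        ts = datetime.strptime(m.group("ts"), "%m/%d/%Y %H:%M:%S")
        counts[(m.group("thread").strip(), ts.strftime("%Y-%m-%d %H:%M"))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the path of a saved log file as the only argument.
    with open(sys.argv[1], errors="replace") as fh:
        for (thread, minute), n in sorted(count_eof_errors(fh).items()):
            print(f"{minute}  {thread}  {n}")
```

A steady count of roughly one per message would be consistent with a sender that opens and closes the connection for every message; a sudden change in the rate just before the process hangs would be more interesting.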

Viewing 8 reply threads

Author

Replies
- October 18, 2016 at 5:30 pm #84654
  Jim Kosloskey
  Participant
  Is the source system disconnecting and reconnecting from time to time?
  
  What release of Cloverleaf?
  
  email: jim.kosloskey@jim-kosloskey.com 30+ years Cloverleaf, 61 years IT – old fart.
- October 18, 2016 at 5:46 pm #84655
  Shane Farney
  Participant
  6.1.1 on AIX
  
  The connection is opening/closing for what I assume to be every message.
- October 18, 2016 at 6:41 pm #84656
  Michael Hertel
  Participant
  Check to make sure only one Epic interface is trying to connect to this one.
  
  It sounds like two different interfaces may be trying to connect to this interface at the same time.
  
  The next time this happens run netstat -a and grep for the port number.
  
  See which system connects. Watch it disconnect and repeat the netstat to see if a different host has connected.
  
  I’ve seen this where test and prod are competing for the same connection because someone forgot to change the port number back.
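The netstat check described in this reply can be scripted so that repeated runs are easy to compare. This is a rough sketch, not a tested tool: it assumes the usual `netstat -an` column layout (protocol, two queue columns, local address, foreign address, state) and handles both the `host.port` style printed on AIX and the `host:port` style printed on Linux. The function names and the port number 5000 in the comments are hypothetical. `-n` is added to Michael's `netstat -a` only so that ports are printed as numbers.

```python
import subprocess


def split_host_port(addr):
    """Split 'host.port' (AIX/BSD) or 'host:port' (Linux) at the last separator."""
    idx = max(addr.rfind(":"), addr.rfind("."))
    return addr[:idx], addr[idx + 1:]


def peers_on_port(netstat_output, port):
    """Return the sorted remote hosts with an ESTABLISHED connection to local `port`."""
    peers = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        if split_host_port(local)[1] == str(port):
            peers.add(split_host_port(foreign)[0])
    return sorted(peers)


def current_peers(port):
    """Run netstat once and report who is connected to `port` right now."""
    out = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    ).stdout
    return peers_on_port(out, port)

# Example (hypothetical port): call current_peers(5000) every few seconds
# and watch whether the list of remote hosts changes between runs.
```

Seeing more than one remote host is not automatically a problem, since a clustered source can legitimately connect from several hosts; the thing to look for is a host that should not be sending to this interface at all, such as a test system.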
- October 18, 2016 at 9:13 pm #84657
  Jim Kosloskey
  Participant
  Those PDL errors you see are the source side closing the connection.
  
  Can’t Bridges keep a persistent connection? That would at least remove those errors.
  
  Also check what Michael suggested to make sure only one system is connecting.
  
  I am not sure whether cleaning up the above will resolve your ‘hung’ process, but it should improve a lot of things (for example, how can you possibly have a reliable connection Alert while the connection keeps dropping?).
  
  email: jim.kosloskey@jim-kosloskey.com 30+ years Cloverleaf, 61 years IT – old fart.
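Jim's explanation matches how TCP reads generally behave: when the peer closes the connection cleanly, a read on the other end returns zero bytes rather than failing. The sketch below illustrates only that general socket behavior in Python, using a local socket pair; it is an assumption, not something confirmed in this thread, that the PDL driver logs this zero-byte read as "read returned error 0". The function names and the payload are invented for the example.

```python
import socket


def read_until_peer_closes(conn, bufsize=4096):
    """Collect data until recv() returns zero bytes (orderly close by the peer)."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:  # zero bytes: the other side closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def demo():
    """Sender delivers one message and disconnects, like a non-persistent client."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        sender.sendall(b"one message")  # placeholder payload, not real HL7
        sender.close()                  # disconnect right after sending
        payload = read_until_peer_closes(receiver)
        after = receiver.recv(1)        # further reads keep returning zero bytes
    return payload, after
```

Calling `demo()` returns the payload followed by an empty read, which is the receiver's only signal that the sender has gone away. With a sender that reconnects for every message, one such zero-byte read per message is expected.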
- October 19, 2016 at 1:01 pm #84658
  aaron kaufman-moore
  Participant
  Shane,
  
  Your experience mirrors ours for our MDM interface from Epic’s EPS outbound documentation. We had the process go down every ~4-6 weeks when running on 6.0.2 but haven’t been up on 6.1.2 long enough to see if the pattern will repeat itself.
  
  It isn’t a persistent connection since it comes from Interconnect/EPS and not Bridges making the connection, so what you see is normal.
  
  We have our inbound set up as multi-server, since we can have multiple sources (the source is clustered and multiple hosts can connect to send messages).
  
  We utilized the error settings in Bridges to immediately notify us when we receive an error 56, so the on-call person usually gets into Cloverleaf quickly and restarts the process before too much needs to be retriggered from Epic.
- October 20, 2016 at 1:36 pm #84659
  Shane Farney
  Participant
  It’s good to know this is normal. I did dig this up from the 6.1.2 release notes, which seems to imply that there was a bug in how Cloverleaf handled this kind of connection:
  
  Issue
  
  Memory leak with MLP PDL (12592)
  
  Description
  
  A memory leak with MLP PDL could happen with sites that have a large number of messages, for example, a query site that has multiple applications connecting/disconnecting to it.
  
  Instead of connecting, sending the query, waiting for the reply, and then disconnecting, the application with the memory leak connects, sends the query, immediately disconnects, and waits for a reply on a separate interface.
  
  When Cloverleaf does not initiate the close connection or send a reply first, it does not clean up the memory until the process is stopped.
  
  This issue no longer happens.
- October 20, 2016 at 2:29 pm #84660
  John Mercogliano
  Participant
  We had the same issue on 5.7: the Epic Bridges analysts would run a report every day and get thousands of messages that had failed and had to be resent. We implemented an alert to restart the process. Since going to 6.1.2, their report has been mostly empty, with only an occasional report showing just a few.
  
  So, 6.1.2 seems to have really cleaned things up on that front.
  
  John Mercogliano
  Semi Retired, contractor
  Hampton Roads, VA
- October 20, 2016 at 4:03 pm #84661
  Shane Farney
  Participant
  John Mercogliano wrote:
  
  We had the same issue on 5.7, the Epic bridges analysis would run a report every day and get 1000’s that failed and had to resend.
- February 24, 2017 at 5:05 pm #84662
  Tom Rioux
  Participant
  Shane,
  
  We have the same issue here. Every time it reconnects to Interconnect, it throws a “read returned error 0” message into our log file. I’m assuming this is due to the PDL.
  
  Did you modify your PDL or use a different one? I was wondering if there is a way to prevent the error from being logged.
  
  Thanks…
  
  Tom Rioux

The forum ‘Cloverleaf’ is closed to new topics and replies.