CL5.7 rev 2 – aix — memory leak with inbound tcp


  • #52311
    Ryan Spires
    Participant

    Has anyone encountered issues with inbound TCP/IP connections that continually bounce and reconnect between messages?

    We have an inbound TCP/IP socket (server connection) that is getting errors pretty consistently in the process error and log files.

    Typically I would consider the errors informational; however, we have discovered what appears to be excessive memory consumption.

    After bouncing the associated process, we regained nearly 20% of our memory. Possibly a coincidence, but it makes me want to look more closely.

    The PDL signaled exception we are getting is as follows:

    [pdl :PDL :ERR /0: fr_ghhsm_rpt:03/02/2011 13:42:34] read returned error 0 (Error 0)

    [pdl :PDL :ERR /0: fr_ghhsm_rpt:03/02/2011 13:42:34] PDL signaled exception: code 1, msg device error (remote side probably shut down)

    The connection is set to reopen every 5 seconds if it fails; it promptly reconnects as expected, then drops again.

    I have tried setting it to multi-server; however, this just causes the error to occur as frequently, if not more so.

    Has anyone encountered this, or does anyone have any thoughts? The obvious fix (have the vendor fix their system) comes to mind, but in the meantime I thought I would poll the group.

    Thanks,

    Ryan Spires

Viewing 11 reply threads
    • #73751
      Bob Richardson
      Participant

      Greetings,

      We are also running CIS5.7R2 on AIX 5.3 TL12.

      This is more of an informational error indicating that the sender has disconnected and Cloverleaf is listening for the sender to reconnect.  We have inbound interfaces where the sender connects only to send messages and then disconnects, leaving the Cloverleaf server in an “opening” status.

      Nothing unusual, at least in our experience.

      As for the memory leak, we have not seen any so far for our inbound interfaces (we use PDL/TCP protocol).

      Have you checked that none of the TCL procedures in the process are leaking message and/or global handles?   That would contribute to memory usage creep over time.

      Hope this helps.

    • #73752
      David Barr
      Participant

      I haven’t seen problems with memory leaks related to connects and disconnects. I’d check your TCL procs to see if you’re doing something wrong there.

      I fixed a memory leak once that was related to an HL7 message parsing library we were using. This particular library created a new TCL command for each message that it parsed (so that it could use more of an object-oriented syntax). The people who were using this library didn’t realize that you had to call a cleanup proc to delete the new command and associated global data.

    • #73753
      Ryan Spires
      Participant

      Thanks for the replies. I’ll check again, but we were only using the rawHl7ack proc and one other filter proc; there is no Xlate in this case.

      Only two threads in this particular process…

      Ryan Spires

    • #73754
      Bob Richardson
      Participant

      Greetings again,

      Does your raw HL7 proc use the GRM calls?

      If so, there may be leaks in that TCL if the GRM variables such as datlist are not cleaned up.

      Check the TCL library in the Clovertech Forum for a version that uses the split-message technique, which performs better.

      Good hunting!

    • #73755
      Ryan Spires
      Participant

      Of the two procs, one is basically a modified version of RawHl7Ack. It has no GRM statements, just pretty basic validation (does the message have an MSH, that type of stuff), and then it creates the ACK from hardcoded values. Nothing special.

      The other proc is a filter, which calls two other procs internally to do segment parsing and field parsing.  Both of those procs are in the same file, and they do not use GRM either. It is not the way I would have written a filter, but to each their own, I guess.

      In any case, I really don’t see an issue with the procs at this point.

      It looks like the conditions will always be met to either CONTINUE or KILL the handle, with no way to fall through without doing one or the other.

      I have turned up my e/o for the connection just to see some more noise. I haven’t done so yet for the process, but I will be doing that next.

      [pdl :open:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:28] Scheduling driver reopen try in 15.0 secs

      [pd  :pdtd:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:28] Set driver status to PD_STATUS_OPENING

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:28] Thread has 0 ready events left.

      [pti :sche:INFO/2: fr_ghhsm_rpt:03/04/2011 08:21:28] Performing apply callback for thread 3

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:43] Thread has 1 ready events.

      [pdl :open:INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:43] Driver attempting reopen

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:43] Thread has 0 ready events left.

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:43] Thread has 1 ready events.

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:43] Thread has 0 ready events left.

      [pti :sche:INFO/2: fr_ghhsm_rpt:03/04/2011 08:21:43] Performing apply callback for thread 3

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:44] Thread has 1 ready events.

      [pd  :pdtd:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:44] Set driver status to PD_STATUS_UP

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:44] Thread has 0 ready events left.

      [pti :sche:INFO/2: fr_ghhsm_rpt:03/04/2011 08:21:44] Performing apply callback for thread 3

      [pti :sche:INFO/1: fr_ghhsm_rpt:03/04/2011 08:21:44] Thread has 1 ready events.

      [pdl :PDL :INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:44] read nothing (link closed)

      [pdl :PDL :ERR /0: fr_ghhsm_rpt:03/04/2011 08:21:44] read returned error 0 (Error 0)

      [pdl :PDL :INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:44] no PDL exception handler registered => input error

      [pdl :PDL :INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:44] input-error in dfa ‘basic-msg’

      [pdl :PDL :ERR /0: fr_ghhsm_rpt:03/04/2011 08:21:44] PDL signaled exception: code 1, msg device error (remote side probably shut down)

      I did notice something just above “input-error in dfa ‘basic-msg’”.

      Then the connection drops (goes to opening). I am not seeing what is actually hitting the PDL. Again, this may very well be normal. The PDL in use is mlp_tcp.pdl, which is used just about everywhere, so I don’t really think it is the issue; otherwise others would most definitely have noticed it.
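
      To put a number on how often the link bounces, the entries in the excerpt above can be tallied per hour. What follows is only a rough Python sketch, assuming the line layout seen in that excerpt (`[module:sub:LEVEL/n: thread:MM/DD/YYYY HH:MM:SS] text`). The name `summarize_bounces`, the regular expression, and the event labels are illustrative and are not part of Cloverleaf.

```python
import re
from collections import Counter

# Line layout assumed from the excerpt above, e.g.
# [pdl :PDL :ERR /0: fr_ghhsm_rpt:03/04/2011 08:21:44] message text
LINE_RE = re.compile(
    r"^\[(?P<module>[^:]+):(?P<sub>[^:]+):(?P<level>[A-Z]+)\s*/\s*(?P<num>\d+):\s*"
    r"(?P<thread>[^:]+):(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\]\s*"
    r"(?P<text>.*)$"
)

# Substrings copied from the excerpt; the labels on the right are hypothetical.
EVENTS = {
    "Driver attempting reopen": "reopen",
    "PDL signaled exception": "pdl_exception",
}


def summarize_bounces(lines):
    """Count reopen attempts and PDL exceptions per (thread, hour, label)."""
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip blank lines and anything in another layout
        hour = f"{match['date']} {match['time'][:2]}:00"
        for needle, label in EVENTS.items():
            if needle in match["text"]:
                counts[(match["thread"], hour, label)] += 1
    return counts


sample = [
    "[pdl :open:INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:43] Driver attempting reopen",
    "[pdl :PDL :INFO/0: fr_ghhsm_rpt:03/04/2011 08:21:44] read nothing (link closed)",
    "[pdl :PDL :ERR /0: fr_ghhsm_rpt:03/04/2011 08:21:44] PDL signaled exception: "
    "code 1, msg device error (remote side probably shut down)",
]

for key, count in sorted(summarize_bounces(sample).items()):
    print(key, count)
```

      Comparing these hourly counts with the process's memory growth over the same period would show whether the two actually move together.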

    • #73756
      Rob Abbott
      Keymaster

      It looks like the remote end is connecting and then immediately closing the connection.  Since they are not sending any data, nothing hits the PDL other than a close of the session.

      You might want to run an IP trace on this port to see exactly what’s happening at the network level.

      Rob Abbott
      Cloverleaf Emeritus
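
      A connect followed by an immediate close, as described above, reaches the reading side as a zero-length read. That is the usual way a TCP socket reports an orderly shutdown by its peer, and it is consistent with the “read nothing (link closed)” entry in the log. A minimal Python sketch of that behavior follows, using a local socket pair in place of a real network link. It illustrates the socket semantics only and is not Cloverleaf's driver code; the helper name `read_until_closed` is hypothetical.

```python
import socket


def read_until_closed(conn):
    """Collect bytes until the peer closes; an empty recv() result means EOF."""
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:  # zero-length read: the peer has closed its side
            break
        chunks.append(data)
    return b"".join(chunks)


# Case 1: the sender connects and closes without sending anything.
listener_side, sender_side = socket.socketpair()
sender_side.close()
nothing_received = read_until_closed(listener_side)
listener_side.close()

# Case 2: the sender writes some bytes and then closes.
listener_side, sender_side = socket.socketpair()
sender_side.sendall(b"MSH|example")
sender_side.close()
payload_received = read_until_closed(listener_side)
listener_side.close()

print(repr(nothing_received), repr(payload_received))
```

      In the first case the listener never sees any payload, only the end of the stream, which is why nothing reaches the protocol layer beyond the close itself.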

    • #73757
      Ryan Spires
      Participant

      The system we are interfacing with is McKesson Horizon Surgery (HSM).

      Is anyone else interfacing with this product and receiving inbound reports containing an embedded PDF (base64) in HL7?

      We do have an inbound charge interface from the same product that is behaving.

    • #73758
      Ted Viens
      Participant

      Here is additional information that was found.

      – The sending system is sending a FIN ACK, which closes the connection, immediately after receiving the HL7 ACK.

      – The inbound thread over time consumes the RAM and necessitates a reboot of the server causing PROD impact.

      – Our last reboot was on 6/28.

      Questions:

      – I am not sure why a transient connection to the IB TCP Server would cause a memory leak on the server.  Can anyone clarify?

      – What can be done to eliminate the issue?  Would moving the TCL procs to a bridge receive solve the problem?

      – Would swapping the TCP/Client and TCP/Server roles be an option?  This is not typical, but we could set up our inbound as a TCP/Client and connect to the application's outbound, reconfigured as a TCP/Server.

    • #73759
      Ted Viens
      Participant

      CORRECTION – There are no TCL Procs being hit on the inbound thread.

      We are receiving base64-encoded data in OBX.5.

    • #73760
      Michael Hertel
      Participant

      Quote:

      Has anyone encountered issues with inbound TCP/IP connections that continually bounce and reconnect between messages?

      Is it possible that someone has two copies of the same external interface running?

      We will see this when someone’s test and prod interfaces are configured and turned on to connect to the same Cloverleaf host and port.

    • #73761
      Brad Dorr
      Participant

      Just an update on this issue: it is really becoming a pain in the neck.  It has stopped the server twice now, forcing us to reboot the AIX server.  We get a “No buffer space available” error on the thread, and after that anything you stop cannot be started again, so we have to reboot.  Interfaces that are already running keep running, but once stopped they cannot be restarted, and nothing else works either: not command-line commands, not the GUI.  If anyone has any ideas, I would be glad to chat.  We are on AIX 6.1.4.

    • #73762
      Michael Hertel
      Participant

      Brad,

      “No buffer space available” happened to us once.

      It turned out the source system was not configured to use ack logic.

      They just kept dumping huge transcription messages on us because they weren’t waiting for (evaluating) the ack messages from the engine.

      They turned on the “use ack” logic and solved our issue.
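
      The “use ack” behavior described above amounts to sending one message and then blocking until its acknowledgment arrives before sending the next, so the sender can never have more than one message queued at the receiver. Below is a minimal Python sketch of that idea, assuming standard MLLP framing (0x0B before the message, 0x1C 0x0D after it). The helper names `frame`, `read_frame`, and `send_and_wait` are hypothetical and do not come from any vendor's implementation.

```python
import socket

START_BLOCK = b"\x0b"    # MLLP start-of-block
END_BLOCK = b"\x1c\x0d"  # MLLP end-of-block followed by carriage return


def frame(message: bytes) -> bytes:
    """Wrap a message in MLLP framing bytes."""
    return START_BLOCK + message + END_BLOCK


def read_frame(conn: socket.socket, timeout: float = 30.0) -> bytes:
    """Read one MLLP frame; fail if the peer closes or the frame is malformed."""
    conn.settimeout(timeout)
    buf = b""
    while not buf.endswith(END_BLOCK):
        chunk = conn.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before a full frame arrived")
        buf += chunk
    if not buf.startswith(START_BLOCK):
        raise ValueError("frame does not begin with the MLLP start byte")
    return buf[len(START_BLOCK):-len(END_BLOCK)]


def send_and_wait(conn: socket.socket, messages, timeout: float = 30.0):
    """Send each message only after the previous one has been acknowledged."""
    acks = []
    for message in messages:
        conn.sendall(frame(message))
        acks.append(read_frame(conn, timeout))  # blocks until the ACK arrives
    return acks
```

      A real sender would also inspect the acknowledgment code in MSA-1 of each ACK before moving on, and would decide what to do on a timeout. This sketch only shows the flow control.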
