Inbound disconnected, but CL still shows "UP"

  • Creator
    Topic
  • #52114
    Jeff Dinsmore
    Participant

    I’ve had three instances in the last week where an inbound connection shows “UP”, but the sending end is disconnected and is refused connection when it tries to reconnect.

    I don’t know if this is a Cloverleaf problem, a problem with the sender, or both, but I do know I need to fix it so it doesn’t fail in the middle of the night – I’m not getting enough sleep ;o)

    So, my thought is to attack this from two angles.

    1) config the inbound connection to be Multi-Server so that the remote can reconnect even though CL thinks it’s still connected.

    2) as a fallback, config an alert to bounce the listener if it doesn’t receive any messages for N minutes.

    Does that sound reasonable?

    Any techniques that might work better?

    Thanks!

    Jeff Dinsmore
    Chesapeake Regional Healthcare
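
The fallback in option 2 boils down to an idle timer: note when the last message arrived, and flag the listener for a bounce once the gap reaches N minutes. A minimal Python sketch of that logic, under stated assumptions: `IdleWatchdog` and its method names are hypothetical and are not Cloverleaf alert settings, and the restart itself would still be carried out by whatever alert action or script the site already uses.

```python
import time


class IdleWatchdog:
    """Tracks how long it has been since the last inbound message (hypothetical helper)."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock            # injectable so the logic can be tested without waiting
        self._last_seen = clock()      # treat start-up as the first "activity"

    def message_received(self):
        # Call this every time a message arrives on the inbound connection.
        self._last_seen = self._clock()

    def idle_seconds(self):
        return self._clock() - self._last_seen

    def should_restart(self):
        # True once the connection has been quiet for at least the allowed window.
        return self.idle_seconds() >= self.max_idle_seconds
```

A fixed N only works when the feed has a predictable rhythm; on an interface that is legitimately quiet for long stretches overnight, the same timer would bounce a healthy listener.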

Viewing 21 reply threads
  • Author
    Replies
    • #73123
      Rob Abbott
      Keymaster

      Normally we see this if there’s a firewall or something in between the endpoints that drops or times out the connection without notifying either side.

      The Multi-server configuration should fix it up; make sure if you have a reply proc that it’s configured properly for multi-server.

      Rob Abbott
      Cloverleaf Emeritus

    • #73124
      Jeff Dinsmore
      Participant

      No firewall involved – all devices are on the same LAN.

      I’m a relative CL newbie. How should a reply proc be properly configured for Multi-Server?

      Jeff Dinsmore
      Chesapeake Regional Healthcare

    • #73125
      Chris Williams
      Participant

      Jeff,

      This issue is not necessarily restricted to firewalls. It can be any switch or router on your LAN through which the connection passes. They all have time-out parameters that can cause you grief.

    • #73126
      Jeff Dinsmore
      Participant

      We discovered that this is the result of a daily snapshot backup of our Cloverleaf virtual machine.

      Right at the end of the backup, the sending client senses a disconnect and goes into a reconnection mode. Cloverleaf shows the connection as still “up”, and so refuses connection.

      The solution for now is to not do the backup, but that’s not a good solution either.

      Have any of you seen similar behavior?

      Others out there running Cloverleaf on VMWare?

      Jeff Dinsmore
      Chesapeake Regional Healthcare

    • #73127
      James Cobane
      Participant

      Jeff,

      One work-around to this might be to define your connection as “Multi-Server” to allow multiple connections.  We’ve done this on some threads where the vendor doesn’t always cleanly break and then wants to re-connect.

      Hope this helps.

      Jim Cobane

      Henry Ford Health

    • #73128
      Robert Denny
      Participant

      We are having a similar issue. Win2003/CL5.3rev3.

      We have one lab that we send orders to and receive results from; they utilize a virtual IP address system. We have had issues where, for one reason or another, they flip between the two NICs that are attached to the virtual address. We are set up with both addresses on our VPN tunnel, but if there is any disruption in the connection, we lose connectivity and have to manually change from one NIC address to the other.

      Would the multi-server configuration work with this issue?

    • #73129
      Chris Roca
      Participant

      We are running 5.7 in a virtual environment and had issues losing connectivity with threads haphazardly.

      We pinpointed the problem to VMware ‘vMotioning’ the Cloverleaf server to another host. Once we anchored the Cloverleaf server to one VM host, the issue went away. That doesn’t help with HA, but it stabilized the environment.

      Any other solutions are welcome.

    • #73130
      Jeff Dinsmore
      Participant

      It seems that the problem for us was the Avamar software we’re using to back up our VMs. We were able to reproduce the error when we ran the backup in the middle of the day.

      When the backup completed, it would disconnect the interface and it would not reconnect. Odd that it didn’t do that to other interfaces.

      We’re currently investigating if this is happening on our Horizon Clinicals (CareLink) interface as well.

      Jeff Dinsmore
      Chesapeake Regional Healthcare

    • #73131
      Ian Morris
      Participant

      Just wanted to let everyone know that we experienced the exact same Up-but-sporadically-not-receiving-messages issue as the OP.  We worked with Support who advised us to change our thread to multi server.  Since then, 20 hours ago, we have not had the issue.  Probably still too early to say it’s fixed but wanted to share that with everyone.

    • #73132
      Jeff Dinsmore
      Participant

      I tried multi-server as well, but discovered that message transfer was then painfully slow – several seconds per message.

      I have not had the opportunity to dig into why.

      Jeff Dinsmore
      Chesapeake Regional Healthcare

    • #73133
      Ian Morris
      Participant

      We also have a theory…Our latest server is a VM Red Hat 5.3 server.

    • #73134
      Rob Abbott
      Keymaster

      Jeff – multiserver should not affect performance like you describe.  Note that if you have any procs that generate acknowledgments or other outbound traffic to a multiserver connection you have to populate DRIVERCTL with a CONNID key – something like this:

      Code:

      # Copy DRIVERCTL (including its CONNID key) from the inbound message to the ACK
      msgmetaset $ackMh DRIVERCTL [msgmetaget $mh DRIVERCTL]

      If you don’t have this logic in place then you wouldn’t be sending an ACK and the other end may be timing out.

      Hope this helps.

      Rob Abbott
      Cloverleaf Emeritus
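
Why the CONNID matters can be shown with a toy model: once a listener holds several client connections at once, a reply has to say which connection it belongs to, or there is nowhere to send it. The Python sketch below is only an illustration of that routing idea. `ConnectionRouter`, `build_ack`, and the dictionary layout are hypothetical; the `driverctl` and `CONNID` keys merely echo the names in the post and do not show how Cloverleaf stores or uses DRIVERCTL internally.

```python
import socket


class ConnectionRouter:
    """Toy model of a listener that holds several client connections at once."""

    def __init__(self):
        self._connections = {}   # connection id -> socket
        self._next_id = 1

    def register(self, sock):
        # Give each accepted connection its own id.
        conn_id = self._next_id
        self._next_id += 1
        self._connections[conn_id] = sock
        return conn_id

    def wrap_inbound(self, conn_id, payload):
        # Tag the message with the connection it arrived on.
        return {"payload": payload, "driverctl": {"CONNID": conn_id}}

    def send_reply(self, reply):
        # Without a connection id there is no way to pick the right socket.
        conn_id = reply.get("driverctl", {}).get("CONNID")
        sock = self._connections.get(conn_id)
        if sock is None:
            return False
        sock.sendall(reply["payload"])
        return True


def build_ack(inbound, payload=b"ACK", copy_driverctl=True):
    # Copying the routing metadata is the step the one-line Tcl example performs.
    ack = {"payload": payload, "driverctl": {}}
    if copy_driverctl:
        ack["driverctl"] = dict(inbound["driverctl"])
    return ack
```

In this model an ACK built with `copy_driverctl=False` is never delivered, which matches the symptom described: the sender waits for an acknowledgment, times out, and throughput drops to seconds per message.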

    • #73135
      Bevan Richards
      Participant

      I have had the exact same issue happening. We use Win 2008 with CL5.7 rev 2. Latency alerts are not an option because the time between messages varies during the night.

    • #73136
      Ian Morris
      Participant

      Bevan Richards wrote:

      I have had the exact same issue happening. We use Win 2008 with CL5.7 rev 2. Latency alerts are not an option because the time between messages varies during the night.

      Did you try configuring your thread as multiserver?

    • #73137
      Jeff Dinsmore
      Participant

      I’d like to revisit this topic.

      Since I originally posted this I’ve gained a better understanding of how Cloverleaf works, and it would appear that this “showing up, but not communicating” state is caused by an abrupt severing of the connection between Cloverleaf and a given connection partner.

      Whether the disconnect is caused by the network or the other end of the connection is of no real concern.  The primary issue is that Cloverleaf, for whatever reason, doesn’t sense the disconnect.

      We primarily see these disconnects on outbound clients – when an outbound queue builds up – so multi-server doesn’t help with that.

      We’re currently running CL5.6.  Do more recent versions handle these disconnect events better?

      Do any of you use other techniques, besides setting an alert to auto-restart or tweaking network protocols, to sense/recover from this type of failure?

      Jeff Dinsmore
      Chesapeake Regional Healthcare

    • #73138
      Robert Milfajt
      Participant

      That just about covers the options right there.  The problem is with the OSI model for communication and the fact that the disconnect happens at a lower level (physical, data link, network-IP or transport-TCP) and that Cloverleaf, running at the application layer, is not informed of this.  The beauty of this model and keeping things separate, so that you don’t have to write code for TCP, etc., is also one of its problems, i.e., how to detect this.

      Hope this helps,

      Robert Milfajt
      Northwestern Medicine
      Chicago, IL
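
One common form of "tweaking network protocols" is TCP keepalive: the OS probes an idle connection and reports an error to the application when the peer stops answering, which is exactly the notification the application layer otherwise never gets. The sketch below shows the socket options involved for a program you control, in Python. The function name `enable_keepalive` and the timing values are illustrative assumptions; whether and how Cloverleaf exposes these options per thread is not covered in this thread.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive and, where the platform allows, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": True}

    # The per-socket tuning constants are platform dependent, so only set
    # the ones this Python build actually exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied
```

With the values above, a peer that vanished silently would be detected after roughly idle + interval × probes seconds on platforms that honor all three settings. Many operating systems default to about two hours of idle time before the first probe when keepalive is not tuned.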

    • #73139
      mike brown
      Participant

      Hi, I opened an INFOR ticket; I have several inbounds that need the multi-server setup.

      Can someone provide the exact steps for setting this up in a new hl7_raw_ack proc?

      mike

    • #73140
      Terry Kellum
      Participant

      We are on Redhat 5.3, on ESX 4.1.  We have our NICs set to “Flexible”.  We don’t have any issues with failover.

    • #73141
      David Coffey
      Participant

      This issue lives in my environment.  Clover 5.8.5 running on Windows 2008.  

      Using Wireshark I have seen the other system send an RST packet, which is a request to tear down the connection. This request is never honored. The stack should honor the request and pass it up to Clover, and the Clover thread should recycle and go into a Listen state.

      At this time I do not know if the issue is the Windows stack or Clover.

      I believe running in MultiServer is a hack to address a flaw.  It also introduces a small security risk: it leaves an interface with a permanent Listen pending, so anyone can connect and inject something into the interface.
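
For reference, this is how the three cases discussed in the thread look to an ordinary socket program: a reset (RST), a clean close (FIN), and a peer that simply goes silent. The Python sketch below is a generic illustration, not a description of Cloverleaf's internals. `classify_read` and `abort_connection` are hypothetical helper names, and the `SO_LINGER` struct layout used to force an RST assumes Linux or macOS.

```python
import socket
import struct


def classify_read(conn, timeout=2.0):
    """Report what a blocking read reveals about the peer."""
    conn.settimeout(timeout)
    try:
        data = conn.recv(4096)
    except ConnectionResetError:
        return "reset"     # the stack received an RST and passed it up
    except socket.timeout:
        return "silent"    # nothing arrived; the peer may be idle or gone
    return "closed" if data == b"" else "data"


def abort_connection(sock):
    # Linger on with a zero timeout makes close() send an RST instead of a FIN.
    # The two-int layout of struct linger is an assumption valid on Linux/macOS.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
```

The "silent" case is the troublesome one: from the application's side it is indistinguishable from a healthy but idle connection, which is why an interface can keep showing "UP" after the path to the peer has been cut.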

    • #73142
      Terry Kellum
      Participant

      We started on Windows in 2003, and had TCP issues before we went live in 2004.  We ditched Windows and deployed on Red Hat.  You have a different set of things to look at, and a more specialized Sys Admin environment, but I’ve never regretted that decision.  If you deploy on Linux, it’s important that you make the system tweaks listed in the installation instructions.

    • #73143
      James Cobane
      Participant

      David,

      I believe the issue is at a lower level within the OSI model than the application layer (where Cloverleaf is).  Essentially, Cloverleaf is not being informed of the disconnect by the OS communications stack, so it believes it is still connected.

      I think the multi-server option to address this is more of a “work-around” than a “hack” 😉

      Jim Cobane

      Henry Ford Health

    • #73144
      David Coffey
      Participant

      Work around or hack.   Either way it should not have to be done.
