engine hung with no errors or other warnings

This topic has 6 replies, 5 voices, and was last updated 14 years, 3 months ago by Tim Gobbel.

Creator

Topic
March 19, 2011 at 12:57 am #52358
Tim Gobbel
Participant
We are on 5.6 rev 2 (I think) on AIX. Tonight the engine just stopped processing messages with no errors or other signs or symptoms. I blamed the vendor until they called back and said it wasn’t them (Right!), but then I did find that the engine had stopped receiving messages and thus had stopped sending them on to the vendor. I checked the logs but saw nothing. I re-booted CL and the messages started to flow. Has anyone experienced this before? Any ideas as to why? Thanx!

Viewing 5 reply threads

Author

Replies
- March 21, 2011 at 1:55 pm #73899
  Kevin Scantlan
  Participant
  Do you have automatic process log cycling turned on? We had problems with that and backed most of it out. We are also on CL 5.6 rev 2, with AIX 5.3. What was the last thing that showed in the process log in question?
- March 21, 2011 at 2:04 pm #73900
  Mark McDaid
  Participant
  Hi Tim,
  
  I have one particular interface with our radiology vendor that has been problematic over the past 3 years. Once every 2-3 weeks, the process hangs with no alert and no indication of any problem. This occurred when we were on version 5.2, and it still occurs now that we are on version 5.6 rev 2. There has to be something coming across that interface from radiology, but I’ve never been able to figure out the root cause of the process hanging.
  
  So, my solution (work-around) was to put the radiology interfaces in their own site within Cloverleaf. Within this site there are 2 processes, ARA and monitorARA. The ARA process contains the actual interface threads; the monitorARA process exists to detect when the ARA process becomes hung. There are 2 threads within the monitorARA process: a timer thread that creates a simple “hello” message every 30 seconds, and another thread that sends the message via a TCP/IP hop to a third thread (fr_monitorARA) that lives in the actual ARA process.
  
  I have an alert set up to go off when this fr_monitorARA thread has not received a message in more than 45 seconds (since it should be receiving a message every 30 seconds from the monitorARA process). I then wrote a script that handles all of the shutdown, clean-up, and restart of the ARA process. This script is set to fire when the alert fires. This setup has worked well for the past 2 years. It doesn’t solve the cause of the problem, but it keeps the interfaces going without human intervention.
- March 21, 2011 at 3:57 pm #73901
  John Mercogliano
  Participant
  Tim,
  
  Did you check your monitord log? Or, as part of your cycling, do you run
  
  Code: hcicmd -p hcimonitord -t d -c "cycle"
  
  When we first set up our automated monitoring solution we ran into the same problem, because the queries to hcimonitord were making the monitord log file grow too large.
  
  Something to look forward to in 5.7: we have automated cycling of log files now.
  
  Hope this helps,
  
  John Mercogliano
  Sentara Healthcare
  Hampton Roads, VA
- March 21, 2011 at 4:45 pm #73902
  Tim Gobbel
  Participant
  Thanx for all your replies. I will look into all of them in relation to what I have and try to get back to you all. Thanx!
- March 22, 2011 at 2:43 am #73903
  Donna Bailey
  Participant
  I had the same kind of issue a while back. Cloverleaf support was pointing me toward a possible problem with a tclproc. We had two procs that were doing some complex work, so we put the threads into their own process and set up alerts etc. to handle the hang-ups. We were able to remove those procs after a short period and didn’t have the hang-ups any longer. Just a thought.
  
  Donna Bailey
  Tele: 315-729-3805
  dbailey@microstar.health
  Micro Star Inc.
- March 22, 2011 at 7:16 pm #73904
  Tim Gobbel
  Participant
  I will keep an eye out for that. We don’t use a lot of tclprocs and I am not aware of any that we changed prior to the hang. I still do not know which process it was, but I suspect the ADT. There was one Xlate that I may have changed, so I will go back and review that. Thanx!
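
For anyone wanting to try the heartbeat approach Mark McDaid describes above, the detection logic can be sketched roughly as follows. This is only an illustration in plain Python, not Cloverleaf code and not his actual script: the class name `HeartbeatWatchdog` and the `on_stale` restart hook are hypothetical, and only the 30-second heartbeat and 45-second alert threshold come from his post.

```python
import time


class HeartbeatWatchdog:
    """Fires a restart hook when no heartbeat has arrived within timeout_s."""

    def __init__(self, timeout_s=45.0, on_stale=None, clock=time.monotonic):
        self.timeout_s = timeout_s                   # alert threshold (45 s in the post)
        self.on_stale = on_stale or (lambda: None)   # hypothetical restart hook
        self.clock = clock                           # injectable so it can be tested
        self.last_seen = clock()

    def beat(self):
        """Record that a heartbeat ("hello") message was received."""
        self.last_seen = self.clock()

    def check(self):
        """Return True and fire on_stale if the heartbeat is overdue."""
        if self.clock() - self.last_seen > self.timeout_s:
            self.on_stale()
            # Re-arm, so one hang triggers one restart rather than a storm.
            self.last_seen = self.clock()
            return True
        return False
```

In a setup like the one in the post, `beat()` would be called each time the monitored side receives the 30-second "hello", `check()` would run on a short timer, and `on_stale` would call whatever shutdown, clean-up, and restart script the site uses.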

The forum ‘Cloverleaf’ is closed to new topics and replies.