Clovertech Forums › Cloverleaf

WARNING: engine terminating due to disk space shortage

  • Creator
    Topic
  • #118490
    Jeff Dawson
    Participant

      Never seen this type of error before. The process that crashed receives RTF documents from Epic and uses a tcp/ip multi-server configuration.  The WARNING in the log seems misleading; I checked our disk space and it's not close to full in PRD.  Has anyone come across this type of error before?

      Running CIS 6.2.6.1 on AIX 7.2 TL2

       

      02/09/2021 14:00:02
      WARNING: engine terminating due to disk space shortage
      [cmd :cmd :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] Receiving a command
      [prod:prod:INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02] Checking for leaked handles in the Xlate interpreter…
      [tcl :out :INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02]
      [prod:prod:INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:Fepic_rtf_xlate:02/09/2021 14:00:02]
      [prod:prod:INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02]
      [prod:prod:INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02] Checking for leaked handles in the TPS interpreter…
      [tcl :out :INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:TSAA_tran_rtf:02/09/2021 14:00:02]
      [prod:prod:INFO/0: TSO_rtf_T:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0: TSO_rtf_T:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0: TSO_rtf_T:02/09/2021 14:00:02]
      [prod:prod:INFO/0: TSO_rtf_T:02/09/2021 14:00:02] Checking for leaked handles in the TPS interpreter…
      [tcl :out :INFO/0: TSO_rtf_T:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0: TSO_rtf_T:02/09/2021 14:00:02]
      [prod:prod:INFO/0:TSO_trans_rtf:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0:TSO_trans_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:TSO_trans_rtf:02/09/2021 14:00:02]
      [prod:prod:INFO/0:TSO_trans_rtf:02/09/2021 14:00:02] Checking for leaked handles in the TPS interpreter…
      [tcl :out :INFO/0:TSO_trans_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:TSO_trans_rtf:02/09/2021 14:00:02]
      [prod:prod:INFO/0: FEtrans_rtf:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0: FEtrans_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0: FEtrans_rtf:02/09/2021 14:00:02]
      [prod:prod:INFO/0: FEtrans_rtf:02/09/2021 14:00:02] Checking for leaked handles in the TPS interpreter…
      [tcl :out :INFO/0: FEtrans_rtf:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0: FEtrans_rtf:02/09/2021 14:00:02]
      [cmd :cmd :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] Shutting down command thread Fepic_rtf_cmd.
      [prod:prod:INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] Checking for leaked handles in the NCI interpreter…
      [tcl :out :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02]
      [prod:prod:INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] Checking for leaked handles in the General interpreter…
      [tcl :out :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] <No active handles>
      [tcl :out :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02]
      [cmd :cmd :INFO/0:Fepic_rtf_cmd:02/09/2021 14:00:02] Removing engine command port file.
      [prod:prod:INFO/0: STARTUP_TID:02/09/2021 14:00:02] Removing pid file.
      [prod:prod:INFO/0: STARTUP_TID:02/09/2021 14:00:02] Engine process 31458002 is terminating
      [prod:prod:INFO/0: STARTUP_TID:02/09/2021 14:00:02] Ended at Tue Feb 9 14:00:02 2021

    Viewing 5 reply threads
    • Author
      Replies
      • #118492
        Jerry Sawa
        Participant

          Hi Jeff.    Sorry, don’t mean to sound condescending, but when you’re checking the disk space, you’re checking for that specific volume/file system, right?  Not the disk space for the entire server?

          • #118496
            Jeff Dawson
            Participant

              Hi Jerry,

              No offense taken, and I appreciate the thought. I checked all our file systems on our AIX server; the FS in question is listed below in the output of the df -g command.

               

              Filesystem     GB blocks   Free     %Used   Iused    %Iused   Mounted on
              /dev/fslv01    610         488.31   20%     258715   1%       /cis
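The same spot check can be scripted. A minimal POSIX-sh sketch, assuming `df -P` output; the `/` default and 90% threshold are illustrative stand-ins, not the poster's values (on the AIX box above the filesystem of interest is /cis and the poster used `df -g`):

```shell
#!/bin/sh
# Minimal sketch: warn when a filesystem's %Used crosses a threshold.
# FS and THRESHOLD are illustrative defaults, not the poster's values.
FS=${FS:-/}
THRESHOLD=${THRESHOLD:-90}

# df -P gives a stable one-line-per-filesystem format; field 5 is %Used.
pct=$(df -P "$FS" | awk 'NR==2 { sub("%", "", $5); print $5 }')

if [ "$pct" -ge "$THRESHOLD" ]; then
    echo "WARNING: $FS is ${pct}% full"
else
    echo "OK: $FS is ${pct}% full"
fi
```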

          • #118493
            Jay Hammond
            Participant

              I don’t remember that error exactly, but I know that our Epic (between versions 2017 and November 2020) had some pretty nasty issues with RTF document sizes (Wound Care I believe with embedded pics) that would time out between Epic and Cloverleaf.  You might check the size of the file being sent to make sure that it isn’t at least part of the issue you’re seeing.

              • #118498
                Jeff Dawson
                Participant

                  Jay, that's an excellent thought. I'll check back to the time the process crashed and see if there are any file sizes that seem out of the ordinary.  One other note: the thread receiving this data isn't doing any type of translation or major processing.

              • #118499
                Jeff Dawson
                Participant

                  It looks like that is the prime suspect now; the SMAT file for that time period is 1.7 GB.  Looking at the RTF messages during that time frame, I see a steady flow of 27 KB messages, then a few 2.2 MB RTF HL7 messages before the crash occurred.  The question I have now is whether there is any type of setting that could possibly increase the memory of a process if it's hitting a threshold.  At this point I'm going to open a support ticket as well, since this occurred in our Production environment, and see if they have any suggestions too.  I'll make sure to update with any findings.
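Hunting for the oversized captures can be done with a size sweep. A hedged sketch; the directory and the 100 MB cutoff are hypothetical, not the actual site paths or limits:

```shell
#!/bin/sh
# Hedged sketch: list files above a size cutoff, e.g. to spot SMAT
# captures or inbound saves that ballooned. DIR and CUTOFF_MB are
# hypothetical defaults; point DIR at the process directory in question.
DIR=${DIR:-.}
CUTOFF_MB=${CUTOFF_MB:-100}

# POSIX find measures -size in 512-byte blocks; 1 MB = 2048 blocks.
find "$DIR" -type f -size +"$((CUTOFF_MB * 2048))" -exec ls -l {} \;
```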

                • #118501
                  Jeff Dawson
                  Participant

                    Here is Infor support's reply. Just a note: our shell is ksh, so the bash command also works in our .profile, located at /home/hci/.profile.

                     

                    “Doing some research, it seems this is a known bug.

                    The official fix is scheduled for CIS 20.1 with AR24896. In the meantime, this should get you through it and back to normal functionality…

                    We can add an env var to the .profile for hci

                    for csh, set it with the command "setenv IGNORE_FSFULL 1"
                    for bash, with the command "export IGNORE_FSFULL=1"

                    Add the above line, whichever works for your system, to the .profile for hci.
                    Open a new terminal and login as hci.
                    Type: echo $IGNORE_FSFULL, if it returns a value, it is set, restart your host and engines. If it does not return a value, use the other style entry.”

                    Service Team indicated the following:

                    I would stop and restart the process that it affects, so it picks up the env variable.
                    Also, you have to restart the hostserver with the env var set.
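The ksh/bash half of the workaround boils down to one line plus the verification step from the reply. A minimal sketch; the variable name and value are from the support message, the echo formatting is illustrative:

```shell
#!/bin/sh
# Per the support reply: add this line to /home/hci/.profile (ksh/bash
# syntax; csh would use "setenv IGNORE_FSFULL 1" instead).
export IGNORE_FSFULL=1

# Verification step from the reply: in a new login shell, echoing the
# variable should return the value when it is set.
echo "IGNORE_FSFULL=${IGNORE_FSFULL:-unset}"
```

Remember that, per the service team, the affected processes and the hostserver must be restarted before the variable takes effect.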

                  • #118544
                    Matthew Brophy
                    Participant

                      fwiw – we ran into this issue a few years ago (CIS 6.1.2) with large documents ending up crashing the site that processes our MDMs.

                      The killer is loading the messages into the recovery DB (x2) and processing the messages while also SMATting.  What we ended up doing was setting a GUI alert to shut down the incoming threads that feed our document repository (OnBase) if the outbound queue is greater than 200 msgs.  This RARELY happens, but it's a life saver, because the alternative is crashing the site and potentially losing thousands of messages if they aren't saved off appropriately.

                       

                      We have two dozen receive threads that feed this OnBase connection, so if a system started flooding us with MDMs/PDFs we could prevent a site crash.
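The protective logic described above can be sketched in a few lines of shell. `queue_depth` and the commented-out stop call are hypothetical stand-ins for the site-specific Cloverleaf commands; only the 200-message threshold comes from the post:

```shell
#!/bin/sh
# Sketch of the protective alert: if the outbound queue to the document
# repository exceeds a threshold, stop the inbound feeder threads.
# queue_depth and stop_thread are HYPOTHETICAL stand-ins for the real
# Cloverleaf commands; the 200-message threshold is from the post.
THRESHOLD=200

queue_depth() {
    # Placeholder: would query the outbound queue depth for a thread.
    echo 250
}

depth=$(queue_depth onbase_out)
if [ "$depth" -gt "$THRESHOLD" ]; then
    for t in feeder_1 feeder_2; do
        echo "stopping inbound thread $t (outbound queue at $depth)"
        # stop_thread "$t"   # site-specific stop command would go here
    done
fi
```

In the real setup this decision lives in a Cloverleaf GUI alert rather than a cron script; the point is the shape of the check, not the plumbing.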

                    • #119666
                      Matthew Rasmussen
                      Participant

                        Hey, so we had the same issue recently.  I’m going to copy the errors that we witnessed for search indexing, as Jim had a hard time finding this post.  But first, here’s an updated response from Infor for the particular issue we were seeing:

                        The following solution might help to answer questions about the incident or resolve the issue associated with it: 2153485

                        Event Note: Matthew,
                        Can we please try the following?
                        For the hci profile please modify the profile.local.end and change the following entries
                        From
                        export IGNORE_MEMCHECK=1
                        export IGNORE_FSFULL=1
                        export MALLOC_CHECK=0

                        to

                        #export IGNORE_MEMCHECK=1
                        #export IGNORE_FSFULL=1
                        #export MALLOC_CHECK=0

                        Here are the errors we found in our logs:

                        WARNING: engine terminating due to disk space shortage
                        [pd :open:ERR /0:zzz_a360_ib_rte:04/06/2022 12:32:56] [0.0.354731441] Unable to complete inbound save due to full disk, engine is terminating.

                        [pd :open:ERR /0:zzz_ib_EpicHMS_ADT:04/06/2022 12:40:59] Initializing secondary for [THREAD].in when not present
                        [pd :open:ERR /0:zzz_ib_EpicHMS_ADT:04/06/2022 12:48:38] [0.0.761496171] Unable to complete inbound save due to full disk, engine is terminating.
                        [pd :open:ERR /0:zzz_ib_EpicHMS_ADT:04/06/2022 12:48:38] Attempting to start an invalid transaction in [SMATDB file]
                        [pd :open:ERR /0:zzz_ib_EpicHMS_ADT:04/06/2022 12:48:38] Unable to start transaction while wrtiting to db [SMATDB file]
                        [pd :open:ERR /0:zzz_ib_EpicHMS_ADT:04/06/2022 12:48:38] Initializing secondary for [SMATDB file] when not present

                        [dbi :dbi :ERR /0:EPIC_945504_IN:04/06/2022 12:26:32] dbiWriteLogMsg: database (/Cloverleaf/cis19.1/integrator/[siteName]/exec/databases/) disk is > 10240 kilobytes full – engine terminating! err:0
                        [dbi :dbi :WARN/0:EPIC_945504_IN:04/06/2022 12:26:32] [0.0.294414534] dbiWriteMsgToRecoveryDb: failed inserting a recovery db record; try again
                        [dbi :dbi :WARN/0:EPIC_945504_IN:04/06/2022 12:26:32] [0.0.294414534] dbiWriteMsgToRecoveryDb: failed inserting a recovery db record; try again
                        [dbi :dbi :ERR /0:EPIC_945504_IN:04/06/2022 12:26:32] [0.0.294414534] dbiWriteMsgToRecoveryDb: failed inserting a recovery db record; err 4
