Last week we had a problem in production with a dblookup against a SQLite table. We have a database table built to hold newborn DOBs. When an ADT message goes to a specific vendor, a tclproc does a dblookup on this table to find the newborn’s DOB and add it to the HL7 message. Last week, messages started to fail, and the log file reported: “Failed to get DB connection”.
We have seen this before in test, but never in production. Right before it started, we noticed a queue depth building on this connection, so we restarted the thread. That’s when we started getting this error. But it gets stranger: for about 3 minutes, until we stopped the process that this thread belongs to, nothing was actually going to the Error Database even though the log file said the messages were failing. Once we restarted the process, failed messages started going to the Error Database from that point forward, but we lost about 3 minutes’ worth of messages.
The only fix we know of when this happens is to sync the table in Cloverleaf and then restart the process. This has always cleared up the problem in test, and it corrected the problem last week in production.
I have two main questions:
(1) Does anyone know what causes this “Failed to get DB connection” error? Since it started when we stopped the thread that contains this tclproc on the outbound tab, we thought maybe the proc was in the middle of a DB read at the time and got confused. Is there a way to keep that from happening? We stop threads all the time to re-establish connections, so if this causes a problem with a DB call, that could be a huge issue for us.
(2) During the 3 minutes between restarting the thread and restarting the process, messages were not being written to the Error DB. We’ve never seen this happen before. Could it be because we are using dblookup in a tclproc? Has anyone seen this happen before?
THANKS for any insight that you may have for us.
Error Message in Log file:
[msg :Tbl :ERR /0:cerner_z_adt_o:09/06/2019 15:04:24] Failed to get DB connection 'newborn_dob'.
[sms :sms :ERR /0:cerner_z_adt_o:09/06/2019 15:04:24] Tcl error:
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] msgId = message0
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] proc = 'tps_get_newborn_dob'
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] args = ''
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] result = 'Failed to get DB connection'
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] errorInfo: '
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] Failed to get DB connection
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] while executing
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] "dblookup newborn_dob $corpId"
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] (procedure "tps_get_newborn_dob" line 44)
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] invoked from within
[sms :sms :ERR /0:cerner_z_adt_o:--/--/---- --:--:--] "tps_get_newborn_dob {MSGID message0} {CONTEXT sms_ob_data} {ARGS {}} {MODE run} {VERSION 3.0}"'
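
As a point of reference for question (1), here is a minimal sketch of one possible guard around the lookup shown in the traceback. It wraps the call in catch and retries a few times, so a failed lookup comes back as a status flag instead of an uncaught Tcl error. The names lookup_with_retry and fake_dblookup are hypothetical and not Cloverleaf commands, the retry count and delay are arbitrary assumptions, and whether a retry actually clears “Failed to get DB connection” is not established. The stand-in proc is only there so the sketch runs in a plain tclsh (8.5 or later) outside the engine.

```tcl
# Hypothetical helper (not a Cloverleaf command): run a lookup command and
# retry on error instead of letting the Tcl error propagate to the engine.
# lookupCmd is a command prefix, e.g. [list dblookup newborn_dob].
# Returns a two-element list: {1 value} on success, {0 errorText} on failure.
proc lookup_with_retry {lookupCmd key {maxTries 3} {delayMs 200}} {
    set lastErr ""
    for {set try 1} {$try <= $maxTries} {incr try} {
        # catch returns 0 when the wrapped command completes normally
        if {![catch {{*}$lookupCmd $key} result]} {
            return [list 1 $result]
        }
        set lastErr $result
        if {$try < $maxTries} {
            # Blocking pause before the next attempt (assumed acceptable here)
            after $delayMs
        }
    }
    return [list 0 $lastErr]
}

# Stand-in for dblookup so the sketch runs outside Cloverleaf:
# it fails twice with the error from the log, then returns a made-up value.
set ::calls 0
proc fake_dblookup {table key} {
    if {[incr ::calls] < 3} {
        error "Failed to get DB connection"
    }
    return "20190901"
}

lassign [lookup_with_retry [list fake_dblookup newborn_dob] 12345 3 10] ok value
puts "ok=$ok value=$value"
```

Inside the real proc, the call would presumably become something like `lookup_with_retry [list dblookup newborn_dob] $corpId`, with the proc deciding what disposition to return when the flag comes back 0. The pause blocks the thread while it waits, so the delay would need to stay small.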