Failed to get DB connection

Homepage Clovertech Forums Cloverleaf Failed to get DB connection

Tagged: , ,

  • Creator
    Topic
  • #112255
    Marla Fallin
    Participant

    We had a problem occur in production last week with a dblookup on a sqlite table.   We have a database table built to hold newborn DOBs.   When we have an ADT message going to a specific vendor, we use a tclproc to do a dblookup on this table to find the newborn’s DOB to add it to the HL7 message.   Last week, messages started to fail and the message in the log file stated: “Failed to get DB connection”.

    We have seen this before in test, but never production.  Right before this started happening, we noticed a queue depth on this connection and so we restarted the thread.    That’s when we started getting this error.  But it gets stranger….for about 3 minutes until we stopped the process that this thread belonged in, nothing was actually going to the Error Database even though the log file said the messages were failing.   Once we restarted the process, then the messages actually started going to the Error Database from that point forward, but we lost about 3 minutes worth of messages.

    The only thing we know to do to correct this situation when it happens is to sync the table in cloverleaf and then restart the process.  This has always cleared up the problem in test and corrected the problem last week in production.

    I have two main questions:

    (1)  Does anyone know what causes this “Failed to get DB connection” error?   Since it started when we stopped the thread that contains this tclproc on the outbound tab, we thought maybe it was trying to do a DB read at the time and got confused.  Is there a way to keep that from happening?    We stop threads all of the time to try to re-establish connections so if this causes a problem with a DB call, that could be an huge issue for us.

    (2)  For the 3 minutes that occurred between restarting the thread and restarting the process, messages were not being written to the Error DB.  We’ve never seen this happen before.  Could it be because we are using dblookup in a tclproc?   Has anyone seen this happen before?

    THANKS for any insight that you may have for us.

     

     

    Error Message in Log file:

    [msg :Tbl :ERR /0:cerner_z_adt_o:09/06/2019 15:04:24] Failed to get DB connection ‘newborn_dob’.
    [sms :sms :ERR /0:cerner_z_adt_o:09/06/2019 15:04:24] Tcl error:
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] msgId = message0
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] proc = ‘tps_get_newborn_dob’
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] args = ”
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] result = ‘Failed to get DB connection’
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] errorInfo: ‘
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] Failed to get DB connection
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] while executing
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] “dblookup newborn_dob $corpId”
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] (procedure “tps_get_newborn_dob” line 44)
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] invoked from within
    [sms :sms :ERR /0:cerner_z_adt_o:–/–/—- –:–:–] “tps_get_newborn_dob {MSGID message0} {CONTEXT sms_ob_data} {ARGS {}} {MODE run} {VERSION 3.0}”‘

Viewing 2 reply threads
  • Author
    Replies
    • #112256
      Charlie Bursell
      Participant

      Maybe messages running too fast such that database is not ready for next message?

      Are you opening and closing the database with each read?  When you get the error, do you attempt recovery via the Tcl proc?  You should.  When you get the error go to a routine that will stop and flush database and then reopen.  If you cannot recover you should stop the process and send notice to someone.

      The reson nothing goes to error database is this is not a Tcl Error but a database error trapped in Tcl.  If you want them to go to error database return an ERROR disposition when you get the error.

      When dealing with databases remember anything that can go wrong usually will. 🙂

    • #113070
      Anthony Kirk
      Participant

      It is true that performance improves when you open the database at startup and close it at shutdown.  A database can be closed, reopened and your action upon it repeated upon an error (so long as you catch it).  For my SQL server connection I add a 5-second sleep into the process – we could all do with a power nap sometimes!  You can repeat that x number of times until you give up.  At that point you should be deciding whether it is safe to continue the message you have without the additional data, or change the disposition like Charlie said.

       

      This is just a hunch but are you capturing the dates of birth of all neonates ad infinitum?  Your database will be significantly smaller if you remove entries beyond day 28.

    • #113243
      Rob Lindsey
      Participant

      Interesting situation.  I would not have used a dblookup on an SQLite table but would just use the standard TCL library and do my selects directly from the table.  See an example of where we have a TCL that is reading directly from a SQLite DB file with a table.  We also update the table after reading it.

      The only time that we have used dblookup is when the Database is on another system and is a different DB other than SQLite.
      <pre><example code>
      package require sqlite
      set dbName “$HciRoot/databases/stvin_hl7_adt.db”
      sqlite DBCMD $dbName
      DBCMD timeout 10000</pre>
      <pre>set sqlcmd “select id,hl7msg from stvin_hl7_adt_msgs where readintoengine is null order by id limit 135;”
      catch {unset x}
      DBCMD eval $sqlcmd x {
      set id $x(id)
      set record $x(hl7msg)
      }
      </example code></pre>
      Rob

Viewing 2 reply threads
  • You must be logged in to reply to this topic.

Forum Statistics

Registered Users
5,129
Forums
28
Topics
9,301
Replies
34,447
Topic Tags
288
Empty Topic Tags
10