Processing Badly Formatted HL7 Batch File


  • #119673
    Lonnie Davis
    Participant

    Hello all!

    I have an issue I could use some advice on.  Those in Texas who use ImmTrac to report vaccines through their unidirectional FTP interface might be able to relate.

    The goal I am trying to accomplish is to download a file containing acknowledgment (ACK) messages from an SFTP server using an inbound fileset-ftp protocol thread.  The source system, ImmTrac, makes the ACK messages available in a single file which can contain anywhere from a couple hundred messages at a few KB in size to several thousand messages at 1-2 MB.

    There are a couple of problems with the file that I am unable to find a way to overcome:

    • The file is an HL7 batch file containing FHS and BHS segments at the beginning of the file and BTS and FTS segments at the end.  These segments need to be removed.
    • The source system does not produce properly newline-formatted HL7 messages.  Each segment is terminated by a carriage return and a newline.  In hex, using hcihd, it appears as 0d0a, and in Notepad++ it appears as CRLF.

    I’m looking for a way to remove the newline characters from the end of each HL7 segment, keep the newline before each MSH segment, and remove the FHS, BHS, BTS, and FTS segments before the file is received by Cloverleaf.
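For reference, the cleanup described above can be expressed as a few string operations in plain Tcl. This is a minimal sketch, not tested against a real ImmTrac file. The proc name `normalizeBatch` is made up for illustration, and it assumes every message starts with `MSH|` at the beginning of a line.

```tcl
# Hypothetical helper: turn a CRLF-delimited HL7 batch into
# CR-terminated segments with a single LF between messages.
proc normalizeBatch {data} {
    # Drop the FHS/BHS/BTS/FTS envelope lines, including their line endings.
    regsub -all -line {^(?:FHS|BHS|BTS|FTS)[^\n]*\n?} $data {} data
    # CRLF -> CR, so each segment ends with the carriage return only.
    set data [string map [list "\r\n" "\r"] $data]
    # Put an LF back in front of every MSH that follows another segment.
    set data [string map [list "\rMSH|" "\r\nMSH|"] $data]
    return $data
}

# Small made-up sample: two ACKs wrapped in a batch envelope.
set sample "FHS|^~\\&\r\nBHS|^~\\&\r\nMSH|^~\\&|A\r\nMSA|AA|1\r\nMSH|^~\\&|B\r\nMSA|AE|2\r\nBTS|2\r\nFTS|1\r\n"

# Show the control characters so the result is readable on a terminal.
puts [string map [list "\r" "<CR>" "\n" "<LF>\n"] [normalizeBatch $sample]]
```

Matching on `\rMSH|` rather than bare `MSH` avoids splitting a message when the letters MSH happen to appear inside a field.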

    There are many different ways to accomplish this and I’m looking for suggestions, as well as example TCL code if available, on the best way to build this type of interface without impacting engine performance too much.

    So far in development and testing using a small file, I can download and read in a file using the “single” format, then create a new HL7 message in an inbound TPS proc for each of the ACK messages in the file, but I hesitate to do this in production where the files can contain several thousand individual messages at a size of 1-2 MB.

    My ideal solution is to download the file, process it, and make one file for each of the HL7 messages in the file to copy to a directory which Cloverleaf reads from using the fileset-local protocol, but I have no idea how to do that or if it is the best solution.
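The one-file-per-message idea could look roughly like the sketch below. It assumes the messages have already been separated into a Tcl list, and that the fileset-local thread is configured to skip names ending in `.tmp` (an assumption about the thread setup, not something Cloverleaf does by default). The proc name `writeMessages`, the `ack` prefix, and the `.hl7` extension are all hypothetical.

```tcl
# Hypothetical helper: write each message to its own file in $dir.
# Each file is written under a temporary name and then renamed, so a
# polling reader does not pick up a half-written file.
proc writeMessages {messages dir {prefix ack}} {
    set written {}
    set n 0
    foreach msg $messages {
        incr n
        set final [file join $dir [format "%s_%05d.hl7" $prefix $n]]
        set tmp "$final.tmp"
        set fh [open $tmp w]
        # Binary translation keeps the CR segment terminators untouched.
        fconfigure $fh -translation binary
        puts -nonewline $fh $msg
        close $fh
        file rename -force $tmp $final
        lappend written $final
    }
    return $written
}

# Usage with two made-up messages and an output directory under the cwd.
set dir [file join [pwd] ack_out]
file mkdir $dir
foreach f [writeMessages [list "MSH|1\rMSA|AA\r" "MSH|2\rMSA|AE\r"] $dir] {
    puts "wrote $f"
}
```

Whether this beats splitting inside the engine depends on file sizes and thread polling settings, so it is worth timing both on a realistic file.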

    What do you all think would be the best approach?

    • #119674
      Paul Bishop
      Participant

      I’m not sure if there’s an easy way to do this using the actual engine.  For similar files, what we have done in our AIX environment is to use a shell script that executes TCL programs to format the messages with the correct segment and message breaks, then just feed the formatted records through a fileset-local thread to be processed like any other HL7 message.  The TCL program would also remove the file and batch header/trailer segments, although those can be handled using existing routing functions (involves a trxid UPOC).

      The TCL would check the first three characters of each line.  If the value is FHS, BHS, BTS or FTS, that line would just be ignored.  If it is MSH, you have a new message.  Any other value at the beginning of a line would have the line appended to the current message being built, separated by \x0D (carriage return).  Every time you come to a line starting with MSH, write out the current message being built and start building a new message.  The tricky part is handling the first MSH found (there is no current message to write out yet), and remembering to write out the last message you have built at the end of the file.  A Boolean flag can handle the first case, and the last you just do at the end of the script before closing your output.
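The line-by-line logic described above can be sketched as follows. This is an illustration under stated assumptions, not the script actually used at Carle: the proc name `buildMessages` is hypothetical, the input is a plain list of lines, and each segment is written with a trailing carriage return.

```tcl
# Hypothetical helper: group lines into HL7 messages.
# $lines is a list of lines; a trailing CR or LF on a line is tolerated.
proc buildMessages {lines} {
    set messages {}
    set current ""
    set haveMsg 0          ;# flag: set once the first MSH has been seen
    foreach line $lines {
        set line [string trimright $line "\r\n"]
        if {$line eq ""} continue
        switch -- [string range $line 0 2] {
            FHS - BHS - BTS - FTS {
                # File/batch header or trailer: ignore it.
            }
            MSH {
                # New message: save the one in progress, if there is one.
                if {$haveMsg} { lappend messages $current }
                set current "$line\r"
                set haveMsg 1
            }
            default {
                # Any other segment belongs to the message being built.
                if {$haveMsg} { append current "$line\r" }
            }
        }
    }
    # Remember the last message at end of input.
    if {$haveMsg} { lappend messages $current }
    return $messages
}

# Usage with a small made-up batch.
set lines [split "FHS|x\r\nBHS|x\r\nMSH|1\r\nMSA|AA\r\nMSH|2\r\nMSA|AE\r\nBTS|2\r\nFTS|1\r\n" \n]
foreach m [buildMessages $lines] {
    puts [string map [list "\r" "<CR>"] $m]
}
```

Because the proc only looks at one line at a time, it should handle large files without holding more than the current message and the result list in memory.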

      Paul Bishop
      Carle Foundation Hospital
      Urbana, IL

    • #119675
      Charlie Bursell
      Participant

      I would read the entire file into the engine using fileset and then process it in an IB proc.

      It looks like you want to process the messages one at a time.  Try something like the below.
      I did not test it, so there may be some fat-finger errors.  This is off the top of my head.

      run {
          # Get the message handle, then the file contents
          set mh [keylget args MSGID]
          set file [msgget $mh]

          # Get segments in a list
          set seglist [split $file \n]

          # Remove FHS, BHS, BTS or FTS
          # Remove back to front so we do not skew list
          set loclist [lreverse [lsearch -all -regexp $seglist {^(FHS|BHS|BTS|FTS)}]]
          foreach loc $loclist {lvarpop seglist $loc}

          # Join messages with no LF (each segment keeps its CR)
          # NOTE: the [list ...] arguments in this proc were dropped by the
          # forum formatting; the values shown are assumed from the comments
          set msglist [join $seglist \n]
          set msglist [string map [list \r\n \r] $msglist]

          # Put LF before each message
          set msglist [string map [list \rMSH| \r\nMSH|] $msglist]

          # Finally, remove beginning and trailing LF
          set msglist [string trim $msglist \n]

          # Pass messages one at a time through the engine
          # To conserve memory get rid of IB file
          set dispList [list "KILL $mh"]

          foreach msg [split $msglist \n] {
              # Copy original metadata
              set nmh [msgcopy $mh]
              msgset $nmh $msg
              lappend dispList "CONTINUE $nmh"
          }

          # Send to engine
          return $dispList
      }
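As a side note on the segment-removal step, the back-to-front index removal can also be written as a single filter using `lsearch` with `-inline` and `-not`, which returns the elements that do not match. The snippet below is a standalone sketch with a made-up segment list. It runs in plain tclsh and does not use any Cloverleaf or TclX commands.

```tcl
# Made-up segment list: batch envelope around a single ACK.
set seglist [list "FHS|^~\\&" "BHS|^~\\&" "MSH|^~\\&|A" "MSA|AA|1" "BTS|1" "FTS|1"]

# -all -inline returns the elements themselves rather than their indices;
# -not inverts the test, so only non-envelope segments are kept.
# The alternation is grouped so that ^ applies to all four segment names.
set kept [lsearch -all -inline -not -regexp $seglist {^(FHS|BHS|BTS|FTS)}]

puts [join $kept \n]
```

This avoids modifying the list while walking it, so the order of removal no longer matters.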

          1. #119686
            Lonnie Davis
            Participant

            Thanks Paul and Charlie!

            After some extensive testing using both suggestions, the solution I chose to use was Charlie’s suggestion to read in the whole file using the “single” setting and create new messages from each one in the file.  The load on the engine was not as big as I thought it would be doing it that way and even the larger multi-MB files were processed and sent to an outbound thread very quickly.
