Processing Badly Formatted HL7 Batch File


  • Creator
    Topic
  • #119673
    Lonnie Davis
    Participant

      Hello all!

      I have an issue I could use some advice on.  Those in Texas who use ImmTrac to report vaccines through their unidirectional FTP interface might be able to relate.

      The goal I am trying to accomplish is to download a file containing acknowledgment (ACK) messages from an SFTP server using an inbound fileset-ftp protocol thread.  The source system, ImmTrac, makes the ACK messages available in a single file which can contain anywhere from a couple hundred messages at a few KB in size to several thousand messages at 1-2 MB.

      There are a couple of problems with the file that I am unable to find a way to overcome:

      • The file is an HL7 batch file containing FHS and BHS segments at the beginning of the file and a BTS and FTS segment at the end.  These segments need to be removed.
      • The source system does not terminate segments the way HL7 expects.  Each segment is terminated by a carriage return and newline instead of a carriage return alone.  In hex, using hcihd, it appears as 0d0a, and in Notepad++ it appears as CRLF.

      I’m looking for a way to remove the newline characters from the end of each HL7 segment, keep the newline before each MSH segment, and remove the FHS, BHS, BTS, and FTS segments before the file is received by Cloverleaf.

      There are many different ways to accomplish this and I’m looking for suggestions, as well as example TCL code if available, on the best way to build this type of interface without impacting engine performance too much.

      So far in development and testing using a small file, I can download and read in a file using the “single” format, then create a new HL7 message in an inbound TPS proc for each of the ACK messages in the file, but I hesitate to do this in production where the files can contain several thousand individual messages at a size of 1-2 MB.

      My ideal solution is to download the file, process it, and make one file for each of the HL7 messages in the file to copy to a directory which Cloverleaf reads from using the fileset-local protocol, but I have no idea how to do that or if it is the best solution.

      What do you all think would be the best approach?
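A minimal sketch of the one-file-per-message idea described above, in plain Tcl. The proc name `writeMessageFiles`, the `ack` prefix, and the `.hl7` and `.tmp` extensions are hypothetical choices for illustration, not ImmTrac or Cloverleaf requirements. It assumes the messages have already been separated and that the fileset-local scan pattern ignores `.tmp` files, so a half-written file is never picked up.

```tcl
# Hypothetical helper: write each HL7 message to its own file in outDir.
# Returns the list of file paths that were written.
proc writeMessageFiles {messages outDir {prefix ack}} {
    file mkdir $outDir
    set paths {}
    set n 0
    foreach msg $messages {
        incr n
        set path [file join $outDir [format "%s_%06d.hl7" $prefix $n]]
        set tmp "$path.tmp"

        set fh [open $tmp w]
        # lf translation writes the data as-is, so the CR (0x0D)
        # segment terminators are not converted on any platform
        fconfigure $fh -translation lf
        puts -nonewline $fh $msg
        close $fh

        # Rename only after the file is complete (assumption: the
        # polling thread does not match *.tmp)
        file rename -force $tmp $path
        lappend paths $path
    }
    return $paths
}
```

The zero-padded counter keeps the files in their original order when the directory is listed alphabetically, which may or may not matter for ACK processing.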

      • #119674
        Paul Bishop
        Participant

          I’m not sure if there’s an easy way to do this using the actual engine.  For similar files, what we have done in our AIX environment is to use a shell script that executes TCL programs to format the messages with the correct segment and message breaks, then just feed the formatted records through a fileset-local thread to be processed like any other HL7 message.  The TCL program would also remove the file and batch header/trailer segments, although those can be handled using existing routing functions (involves a trxid UPOC).

          The TCL would check the first three characters of each line.  If the value is FHS, BHS, BTS or FTS, that line would just be ignored.  If it is MSH, you have a new message.  Any other value at the beginning of a line would have the line appended to the current message being built, separated by a carriage return (0x0D).  Every time you come to a line starting with MSH, write out the current message being built and start building a new message.  The tricky part is handling the first MSH found (there is no current message to write out yet), and remembering to write out the last message you have built at the end of the file.  A Boolean flag can handle the first case, and the last you just do at the end of the script before closing your output.

          Paul Bishop
          Carle Foundation Hospital
          Urbana, IL
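The line-by-line approach Paul describes could be sketched roughly as follows in standalone Tcl. This is not his actual script. The proc name `splitBatch` is hypothetical, and instead of a separate Boolean flag it treats an empty "current message" as meaning that no MSH has been seen yet.

```tcl
# Hypothetical helper: split batch-file text into a list of HL7 messages.
# Each returned message has its segments terminated by CR (0x0D).
proc splitBatch {data} {
    set messages {}
    set current ""
    # Treat CRLF, bare CR, or bare LF as a line break
    set data [string map [list "\r\n" "\n" "\r" "\n"] $data]
    foreach line [split $data "\n"] {
        if {$line eq ""} { continue }
        switch -- [string range $line 0 2] {
            FHS - BHS - BTS - FTS {
                # File and batch header/trailer segments are ignored
                continue
            }
            MSH {
                # New message: flush the one in progress, if any
                if {$current ne ""} { lappend messages $current }
                set current "$line\r"
            }
            default {
                # Any other segment is appended to the message being
                # built; segments before the first MSH are skipped
                if {$current ne ""} { append current "$line\r" }
            }
        }
    }
    # Write out the last message built
    if {$current ne ""} { lappend messages $current }
    return $messages
}
```

A script run from the shell would read the downloaded file, call `splitBatch` on its contents, and write the resulting messages wherever the fileset-local thread picks them up.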

        • #119675
          Charlie Bursell
          Participant

            I would read the entire file into the engine using fileset and then process it in an IB proc.

            It looks like you want to process the messages one at a time.  Try something like the below.
            I did not test it, so there may be some fat-finger errors.  This is off the top of my head.

            run {
                # Get the message handle, then the file contents
                set mh [keylget args MSGID]
                set file [msgget $mh]

                # Get segments in a list; splitting on LF leaves the CR
                # at the end of each segment
                set seglist [split $file \n]

                # Remove FHS, BHS, BTS and FTS segments
                # Remove back to front so we do not skew the list
                set loclist [lreverse [lsearch -all -regexp $seglist {^(FHS|BHS|BTS|FTS)}]]
                foreach loc $loclist {lvarpop seglist $loc}

                # Join the segments with no LF; the CRs stay as terminators
                set msglist [join $seglist ""]

                # Put an LF before each message
                set msglist [string map [list "MSH|" "\nMSH|"] $msglist]

                # Finally, remove the leading LF
                set msglist [string trim $msglist \n]

                # Pass the messages one at a time through the engine
                # To conserve memory, get rid of the IB file
                set dispList [list "KILL $mh"]

                foreach msg [split $msglist \n] {
                    # Copy original metadata
                    set nmh [msgcopy $mh]
                    msgset $nmh $msg
                    lappend dispList "CONTINUE $nmh"
                }

                # Send to engine
                return $dispList
            }

                1. #119686
                  Lonnie Davis
                  Participant

                    Thanks Paul and Charlie!

                    After some extensive testing using both suggestions, the solution I chose to use was Charlie’s suggestion to read in the whole file using the “single” setting and create new messages from each one in the file.  The load on the engine was not as big as I thought it would be doing it that way and even the larger multi-MB files were processed and sent to an outbound thread very quickly.
