Processing Badly Formatted HL7 Batch File

This topic has 3 replies, 3 voices, and was last updated 3 years, 3 months ago by Lonnie Davis.

April 21, 2022 at 4:47 pm #119673
Lonnie Davis
Participant
Hello all!

I have an issue I could use some advice on. Those in Texas who use ImmTrac to report vaccines through their unidirectional FTP interface might be able to relate.

The goal I am trying to accomplish is to download a file containing acknowledgment (ACK) messages from an SFTP server using an inbound fileset-ftp protocol thread. The source system, ImmTrac, makes the ACK messages available in a single file which can contain anywhere from a couple hundred messages at a few KB in size to several thousand messages at 1-2 MB.

There are a couple of problems with the file that I am unable to find a way to overcome:
- The file is an HL7 batch file containing FHS and BHS segments at the beginning of the file and BTS and FTS segments at the end. These segments need to be removed.
- The source system does not terminate segments the way HL7 expects. Each segment ends with a carriage return followed by a newline instead of a carriage return alone. In hex, using hcihd, it appears as 0d0a, and in Notepad++ it appears as CRLF.
I’m looking for a way to remove the newline character from the end of each HL7 segment, keep the newline before each MSH segment, and remove the FHS, BHS, BTS, and FTS segments before the file is received by Cloverleaf.

There are many different ways to accomplish this and I’m looking for suggestions, as well as example TCL code if available, on the best way to build this type of interface without impacting engine performance too much.

So far in development and testing using a small file, I can download and read in a file using the “single” format, then create a new HL7 message in an inbound TPS proc for each of the ACK messages in the file, but I hesitate to do this in production where the files can contain several thousand individual messages at a size of 1-2 MB.

My ideal solution is to download the file, process it, and make one file for each of the HL7 messages in the file to copy to a directory which Cloverleaf reads from using the fileset-local protocol, but I have no idea how to do that or if it is the best solution.

What do you all think would be the best approach?

Replies
- April 21, 2022 at 5:06 pm #119674
  Paul Bishop
  Participant
  I’m not sure if there’s an easy way to do this using the actual engine. For similar files, what we have done in our AIX environment is to use a shell script that executes TCL programs to format the messages with the correct segment and message breaks, then feed the formatted records through a fileset-local thread to be processed like any other HL7 message. The TCL program would also remove the file and batch header/trailer segments, although those can be handled using existing routing functions (this involves a trxid UPOC).
  
  The TCL would check the first three characters of each line. If the value is FHS, BHS, BTS, or FTS, that line is ignored. If it is MSH, you have a new message. A line starting with any other value is appended to the message currently being built, separated by a carriage return (0x0D). Every time you come to a line starting with MSH, write out the message being built and start building a new one. The tricky parts are handling the first MSH found (there is no current message to write out yet) and remembering to write out the last message at the end of the file. A Boolean flag can handle the first case; for the last, write the message out at the end of the script before closing your output.
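
  The steps above could be sketched in plain TCL roughly as follows. This is a minimal, untested-in-production sketch, not the script described in this reply: the proc names `splitBatch` and `reformatFile` and the file paths are hypothetical, and it assumes the whole file fits comfortably in memory (the files discussed here are 1-2 MB).

  ```tcl
  # Sketch only: splitBatch and reformatFile are hypothetical names.
  # Split batch text into a list of HL7 messages, dropping the
  # FHS/BHS/BTS/FTS segments and terminating each segment with CR.
  proc splitBatch {data} {
      set messages {}
      set current {}
      set haveMsg 0    ;# flag: false until the first MSH is seen
      # Normalize CRLF and bare CR to LF so one split handles all inputs
      set data [string map [list \r\n \n \r \n] $data]
      foreach line [split $data \n] {
          set id [string range $line 0 2]
          if {$line eq "" || [lsearch -exact {FHS BHS BTS FTS} $id] >= 0} {
              continue
          }
          if {$id eq "MSH"} {
              # Write out the message built so far, then start a new one
              if {$haveMsg} {
                  lappend messages "[join $current \r]\r"
              }
              set current [list $line]
              set haveMsg 1
          } elseif {$haveMsg} {
              lappend current $line
          }
      }
      # Do not forget the last message in the file
      if {$haveMsg} {
          lappend messages "[join $current \r]\r"
      }
      return $messages
  }

  # Read inPath and write one newline-terminated message per record
  proc reformatFile {inPath outPath} {
      set in [open $inPath r]
      fconfigure $in -translation binary
      set data [read $in]
      close $in
      set out [open $outPath w]
      fconfigure $out -translation binary
      foreach msg [splitBatch $data] {
          puts -nonewline $out "$msg\n"
      }
      close $out
  }
  ```

  A shell script could then call something like `reformatFile acks_in.txt acks_out.txt` (both paths are placeholders) and drop the output where a fileset-local thread picks it up. Keeping the parsing in `splitBatch` separate from the file handling makes the logic easy to try against a small sample string first.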
  
  Paul Bishop
  Carle Foundation Hospital
  Urbana, IL
- April 22, 2022 at 12:06 am #119675
  Charlie Bursell
  Participant
  I would read the entire file into the engine using fileset and then process it in an IB proc.
  
  It looks like you want to process the messages one at a time. Try something like the below.
  I did not test it so there may be some fat finger errors. This is off the top of my head.
  
  run {
      # Get the message handle, then the file contents
      keylget args MSGID mh
      set file [msgget $mh]

      # Normalize CRLF to LF, then get the segments in a list
      set seglist [split [string map [list \r\n \n] $file] \n]

      # Remove FHS, BHS, BTS, and FTS segments, plus any empty lines
      set seglist [lsearch -all -inline -not -regexp $seglist {^(FHS|BHS|BTS|FTS)|^\s*$}]

      # Join the segments with CR, the HL7 segment terminator
      set data [join $seglist \r]

      # Put LF before each message in place of the CR
      set data [string map [list \rMSH| \nMSH|] $data]

      # Split on LF to get one list element per message
      set msglist [split [string trim $data] \n]

      # Pass messages one at a time through the engine
      # To conserve memory, get rid of the IB file
      set dispList [list "KILL $mh"]
      foreach msg $msglist {
          # Copy original metadata
          set nmh [msgcopy $mh]
          msgset $nmh $msg\r
          lappend dispList "CONTINUE $nmh"
      }

      # Send to engine
      return $dispList
  }
- April 27, 2022 at 4:10 pm #119686
  Lonnie Davis
  Participant
  Thanks Paul and Charlie!
  
  After some extensive testing using both suggestions, the solution I chose to use was Charlie’s suggestion to read in the whole file using the “single” setting and create new messages from each one in the file. The load on the engine was not as big as I thought it would be doing it that way and even the larger multi-MB files were processed and sent to an outbound thread very quickly.