Inbound PDF to base64-encoded HL7

  • #118608
    Timothy O’Donnell
    Participant

    Good afternoon! I’m working on an interface on Cloverleaf 6.1.4 that takes in PDFs, base64-encodes them, and embeds the result in HL7, and I’m running into issues.

    VisualCron can drop a PDF file to Cloverleaf for me. I want to use a fileset-local inbound to pick up the PDF, pull the file name (which will contain the patient identifier, like V0000011.pdf), base64-encode the PDF, and pass both the encoded string and the file name to an xlate. The xlate would build an ORU, using SQL queries to grab more data on the patient, and drop the encoded PDF string into OBX.5.

    I’ve written a dirParse tcl to process only PDF files and an archive tcl to copy the PDF off to an archive as needed. I have a TPS to encode the PDF and pass that and the file name forward to the xlate. The xlate sits on a route between the fileset-local thread and another file thread (just for testing for now), with a VRL inbound and an HL7 2.4 ORU outbound. The issue is that the data passed from the fileset-local inbound is just thousands of characters, which I assume is the PDF being parsed as text, so I’m getting dozens of outbound ORUs with various strings in OBX.5. Clearly I’m doing something wrong, but I can’t for the life of me find anything on this forum that clearly explains how to set up something like this.

    I think I need to have the inbound tcl on the Trx ID Determination Format: UPOC instead of TPS Inbound Data, but no matter what I do, I can’t get the filename from DRIVERCTL, and I can’t get just the encoded PDF string and the filename sent forward to the xlate instead of all those random PDF characters. Any help would be appreciated. I can post some tcl if need be, but it’s super rudimentary, so I’m willing to start from scratch if needed. Thanks!

    -Timothy

    • #118610
      Jim Kosloskey
      Participant

      What is the Style you have associated with the Fileset Protocol?

      If you expect just one PDF per file, then use a style like ‘single’ which will treat the entire file as a single message.

      You could put the file name in the USERDATA message metadata (don’t forget to use a keyed list) then your VRL would only have one field (unlimited length) – that being the Base64 encoded PDF. But that is just something different and should not have caused you the issue you are seeing. I just like to have a PDF defined as a single field VRL whenever possible.

      Let us know if you are already using the ‘single’ style.

      email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

      • #118612
        Timothy O’Donnell
        Participant

        Jim,

        I didn’t have the Style set to “Single” so I have changed that and bounced the process.

        For the Directory Parse, I’m simply saying “PDF only” and then I have a tps on TPS Inbound Data (code snippet below) to encode the PDF:

        run {
            # 'run' mode always has a MSGID; fetch and process it
            keylget args MSGID mh
            package require base64

            set msg [msgget $mh]
            fconfigure $msg -translation binary
            set encodedPDF [base64::encode [read -nonewline $msg]]
            lappend new_msg $encodedPDF

            msgset $mh $new_msg
            lappend dispList "continue $mh"
        }

        Now I’m getting this ERR: can not find channel named “%PDF-1.7 so I’m not sure what I should be writing for this TPS Inbound Data tcl in order to pass the appropriate data along to the xlate.

        -Timothy

      • #118613
        Jim Kosloskey
        Participant

        There should be no need for you to read the file again. The file has already been read at the point your proc is being invoked and what is in your msg variable should be the PDF. Then just Base64 encode $msg.

      • #118614
        Timothy O’Donnell
        Participant

        Jim,

        I removed that unnecessary read, but the value being passed to me is just the string “%PDF-1.7” instead of the actual PDF that I’m feeding it as a test.

        This is the relevant code snippet for my dirParse:

        run {
            # 'run' mode always has a MSGID; fetch and process it
            keylget args MSGID mh

            # add double quotes around filename with spaces
            #set files [string map {"\x0d\x0a" "\x01"} [msgget $mh]]

            set files [msgget $mh]
            set newlist {}
            foreach f $files {
                if {![regexp -- {.*\.pdf} $f]} {
                    echo Skip $f
                    continue
                }
                lappend newlist $f
            }
            msgset $mh $newlist
            lappend dispList "CONTINUE $mh"
        }

        And this is the relevant code snippet for my TPS Inbound Data tcl:

        run {
            # 'run' mode always has a MSGID; fetch and process it
            keylget args MSGID mh
            package require base64

            set msg [msgget $mh]
            echo "This is the Message: $msg"
            fconfigure $msg -translation binary
            set encodedPDF [base64::encode $msg]
            echo "The encoded PDF is $encodedPDF"
            lappend new_msg $encodedPDF

            msgset $mh $new_msg
            lappend dispList "CONTINUE $mh"
        }

      • #118615
        Jim Kosloskey
        Participant

        I don’t think you need the fconfigure either. At this point there is no file, just a message (hopefully) from the file.

      • #118616
        Timothy O’Donnell
        Participant

        Jim,

        I removed that fconfigure as well. I’m still getting just the string “%PDF-1.7” so I’m wondering if you know what would cause that to be passed from the Directory Parse tcl.

        -Timothy

      • #118617
        Jim Kosloskey
        Participant

        The dirparse UPoC proc receives a list of files Cloverleaf has located in the specified directory. The proc then alters that list to only include the names of the files one wants and returns that list.

        At this point NO files have actually been read.

        Once the DirParse UPoC returns control to the engine, the protocol Opens and Reads each file in the modified list one at a time.

        Each read (based on style) returns a Message Handle to the IB TPS UPoC, which is the current message found in the current read of the current file.

        In your case, the msg variable contains the message as read by the Fileset Local protocol (based on the style).

        So when you test with the EO turned up, the process log should show the message as read by the protocol. If all of the Protocol settings are correct, that should be your PDF.

        I am assuming you are testing with just one file (and hopefully a small PDF) to get this working. If that is the case, you should see your PDF message from the file in the log.

        If you would like to take this off-line, email me and I will try to assist.
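
        The filtering step described above for the dirparse UPoC can be sketched in plain Tcl, outside the engine. This is only an illustration: filterPdfNames is a hypothetical helper name, and the sample file names are made up. It compares the extension case-insensitively, whereas an unanchored, case-sensitive pattern such as {.*\.pdf} would also accept a name like report.pdf.bak and would skip V0000012.PDF.

        ```tcl
        # Hypothetical helper (not a Cloverleaf command): keep only names ending in .pdf
        proc filterPdfNames {names} {
            set kept {}
            foreach name $names {
                # [file extension] returns the last ".ext"; compare it case-insensitively
                if {[string equal -nocase [file extension $name] ".pdf"]} {
                    lappend kept $name
                }
            }
            return $kept
        }

        # Made-up directory listing for illustration
        puts [filterPdfNames {V0000011.pdf notes.txt V0000012.PDF report.pdf.bak}]
        ```

        In a real dirparse proc the input list would come from msgget and the filtered list would go back with msgset, as in the snippet earlier in this thread.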

    • #118611
      Jim Kosloskey
      Participant

      Oh and please don’t forget to use the appropriate components of OBX-5 to identify and contain the Embedded Data as well as setting OBX-2 to the proper value (ED).
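
      As a rough illustration of that OBX layout, here is a plain-Tcl sketch that assembles the segment as a string. The ED data type has five components (source application, type of data, data subtype, encoding, data). The helper name buildPdfObx, the observation identifier PDF^Report, and the type/subtype values AP and PDF are assumptions for the example; the receiving system’s specification decides the real values. In an xlate you would normally populate the OBX-5 subfields directly rather than build the string by hand.

      ```tcl
      # Hypothetical helper: build an OBX segment whose OBX-5 uses the ED data type.
      # ED components: source application ^ type of data ^ data subtype ^ encoding ^ data
      proc buildPdfObx {setId obsId base64Data} {
          # "AP" and "PDF" are assumed values; confirm them with the receiving system
          set obx5 [join [list "" "AP" "PDF" "Base64" $base64Data] "^"]
          # OBX-1 set ID | OBX-2 value type (ED) | OBX-3 identifier | OBX-4 (empty) | OBX-5
          return [join [list OBX $setId ED $obsId "" $obx5] "|"]
      }

      # "JVBERi0xLjcK" is the base64 form of the 9 bytes "%PDF-1.7\n"
      puts [buildPdfObx 1 "PDF^Report" "JVBERi0xLjcK"]
      ```

      With those inputs the segment comes out as OBX|1|ED|PDF^Report||^AP^PDF^Base64^JVBERi0xLjcK.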

    • #118639
      Robert Kersemakers
      Participant

      Oh, I have done this.
      As Jim said: you shouldn’t use DirParse as it only reads/lists the files found and does not read the actual files.

      You will need to make a UPoC that reads the filename and processes this file as you would like it to. What I did: read the filename, copy the file itself to the place specified (as parameter of the script) then use the filename and turn it into a (VRL) message. This message is sent on to be translated by an xlate. This way I have a generic script to process several different files.

      Zuyderland Medisch Centrum; Heerlen/Sittard; The Netherlands

    • #118676
      Timothy O’Donnell
      Participant

      For anyone who comes across this thread in the future looking to do something similar, I finally got this to work with the help of everyone on this thread:

      Set up the inbound thread as fileset-local with Style: single (important if each file is a single, non-base64-encoded PDF!)

      I’m using a dirParse but just to skip any non-PDF file. Doesn’t do anything else. I also have an Archive tcl for Deletion to just copy the PDF to another folder for testing purposes. This won’t be in Production.

      I have a tcl on TPS Inbound Data for the base64 encode that returns the file name (without extension) and the base64-encoded PDF, separated by a comma. That data is then passed to the xlate, which uses an inbound VRL with two fields – file name and embedded PDF – and I build the HL7 from there with the necessary data. Here’s the run block from my tcl; your mileage may vary depending on what you need to do with the information. At the very least, this could be a good jumping-off point.

      run {
          # 'run' mode always has a MSGID; fetch and process it
          keylget args MSGID mh
          package require base64

          set msg [msgget $mh]
          set filepath {}
          set drvCtl [msgmetaget $mh DRIVERCTL]
          keylget drvCtl FILENAME filepath
          set filename [file rootname [file tail $filepath]]
          lappend new_msg $filename
          set encodedPDF [base64::encode -maxlen 0 $msg]
          lappend new_msg $encodedPDF

          msgset $mh [join $new_msg ","]
          lappend dispList "CONTINUE $mh"
      }

      Hope this helps!

      -Timothy
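
      The file-name handling and the -maxlen 0 option in the proc above can be tried outside the engine with tclsh and tcllib. This is a minimal sketch: buildRecord and the sample path are hypothetical, and a tiny string stands in for the PDF bytes. tcllib’s base64::encode wraps its output with newlines by default, and -maxlen 0 turns that off so the value stays on one line for the VRL field.

      ```tcl
      package require base64

      # Hypothetical helper: build "name,base64" from a file path and the raw PDF bytes
      proc buildRecord {filePath pdfBytes} {
          # /some/dir/V0000011.pdf -> V0000011
          set name [file rootname [file tail $filePath]]
          # -maxlen 0 disables the default line wrapping of the encoded output
          set encoded [base64::encode -maxlen 0 $pdfBytes]
          return [join [list $name $encoded] ","]
      }

      set record [buildRecord "/some/inbound/V0000011.pdf" "%PDF-1.7\n"]
      puts $record

      # Split on the first comma only; the base64 alphabet never contains a comma
      set idx [string first "," $record]
      set name [string range $record 0 [expr {$idx - 1}]]
      set data [string range $record [expr {$idx + 1}] end]
      puts "name=$name decodedLength=[string length [base64::decode $data]]"
      ```

      One caveat with a comma delimiter: a file name that itself contains a comma would shift the fields, so this only holds while the names are plain identifiers like the visit numbers used here.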

      • #119039
        Timothy O’Donnell
        Participant

        UPDATE: After being delayed for a few months, we finally went live with this project only to find out that the process I outlined above didn’t work exactly as planned. The overall concept worked but the PDF was blank – correct title and number of pages, but no content. We didn’t run into this when first testing but we also tested with smaller-sized PDFs and I relied on the vendor to confirm the PDFs were valid. Lessons learned.

        I decided to reassess how I set this up and came up with a solution that actually worked. The setup is roughly the same as before, but on my fileset-local inbound thread, on the Inbound tab > TPS Inbound Data, I changed the tcl to only pull the filename of the PDF (which is a patient identifier in this case) and pass that along to the xlate. I changed the VRL to have one field (the filename) instead of the previous two (the filename and the base64-encoded PDF string).

        I also updated the Archive tcl on the fileset-local inbound thread so that it copies the PDF to a “staging” folder on the CL server.

        The xlate builds the HL7, SQL-querying with the filename to build the patient demographics out as required by the vendor. This is the same as it was before. The difference is that I now have a tcl snippet in the xlate that takes in the filename value, opens the PDF that the Archive tcl put in the “staging” folder, and does the base64 encoding there, using fconfigure to put the file channel in binary mode first. The outbound variable is now the base64 string, and I then delete the copied file from the “staging” folder as it’s no longer needed. I’ve put the relevant tcl snippet below.

        package require base64
        set visitNumber $xlateInVals
        set filePath "$HciRoot/data/UTF/Staging/$visitNumber.PDF"
        set fileHandle [open $filePath r]
        fconfigure $fileHandle -translation binary
        set encodedPDF [string map {\n ""} [base64::encode [read $fileHandle [file size $filePath]]]]
        close $fileHandle
        set xlateOutVals $encodedPDF
        file delete -force "$HciRoot/data/UTF/Staging/$visitNumber.PDF"

        This may not be the most elegant or efficient way to do this, but given the size and scope of this interface – 20-30 600KB PDFs dropped overnight during non-peak hours for an interface that doesn’t require a huge amount of patient demographics in the HL7 – it works swimmingly. I also found that a base64 encode/decode site is a huge help for quickly checking whether my base64-encoded PDF string was correct. Definitely recommend for speedy confirmation.

        -Timothy
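
        The thread does not pin down why the first approach produced blank pages, so the following is not a diagnosis. It is a small round-trip check, runnable with tclsh and tcllib, that confirms a binary-mode read followed by base64 encoding preserves every byte. The helper name encodeFileBase64 and the scratch file name are hypothetical; the sample bytes include CR/LF, a NUL, and high-bit values, which are the kinds of bytes a text-mode read can alter.

        ```tcl
        package require base64

        # Hypothetical helper: read a file as raw bytes and return unwrapped base64
        proc encodeFileBase64 {path} {
            set fh [open $path r]
            # Binary translation turns off end-of-line and character-set conversion
            fconfigure $fh -translation binary
            set bytes [read $fh]
            close $fh
            return [base64::encode -maxlen 0 $bytes]
        }

        # Ten sample bytes: "%PDF", CR, LF, NUL, 0x1A, 0xFF, 0x80
        set sample [binary format H* "255044460d0a001aff80"]
        set path [file join [pwd] "roundtrip_sample.bin"]   ;# hypothetical scratch file
        set out [open $path w]
        fconfigure $out -translation binary
        puts -nonewline $out $sample
        close $out

        # Encode from disk, decode again, and compare with the original bytes
        set decoded [base64::decode [encodeFileBase64 $path]]
        file delete $path
        puts [expr {$decoded eq $sample ? "round trip OK" : "round trip FAILED"}]
        ```

        The same comparison can be run against a real PDF by decoding the generated string back to a file and opening it, which is an offline alternative to pasting the string into a decode site.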

    • #118677
      Jim Kosloskey
      Participant

      I am glad you got this to work.

      Thanks for sharing. I am sure this will be helpful to others.

    • #118678
      Jim Kosloskey
      Participant

      I am curious – where do you get the demographic information for the HL/7 message (at least a patient identifier)? Or does the receiving system not care?

      • #118679
        Timothy O’Donnell
        Participant

        The PDF filename will always be the Patient Visit Number from our EMR, and I have an Advanced Database Lookup to our EMR SQL server in the xlate to pull the minimum necessary demographics (Name, DOB, Gender, Department, etc.) based on that Visit Number to fill in the gaps in the HL7. The volume for this interface is likely to be relatively small and controlled – files will always be dropped at a specified time and not in large quantities – so the combination of PDF file sizes and SQL queries shouldn’t be too taxing. If the overall volume were larger, I’d probably want the patient demographics in file metadata or in the file name as well.

        -Timothy

    • #118680
      Jim Kosloskey
      Participant

      Excellent! And you can use the Fileset Protocol pacing parameters to control the arrival rate in order to smooth things out.
