Has anyone parsed the contents of a PDF document using TCL?

  • #55778
    David Coffey
    Participant

      Actually, I do know of a shop that has done this to extract PHI and build the HL7 message on the fly, and I am hoping they respond. Has anyone else done it? How is it done?

      David Coffey

      • #86483
        David Barr
        Participant

          Install poppler-utils on Red Hat; it includes a utility called “pdftotext”. Write your message out to a file, exec the pdftotext utility from TCL, and read the output back in. You’ll then have to parse the results based on how the data is laid out in the PDF and convert it to HL7. It usually helps to run “pdftotext -layout”, which makes the output follow the order of items on the page rather than the order in which they appear in the PDF file (the two can differ).
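
          A minimal sketch of the steps above, assuming pdftotext is on the PATH and that the PDF bytes are already in a TCL variable. The proc names (pdf_to_text, field_after_label), the scratch-file naming, and the "Label:" convention are made up for illustration; the real parsing depends entirely on how your PDFs are laid out.

```tcl
# Hypothetical helper (not a Cloverleaf or Poppler API): write PDF bytes
# to a scratch file, run pdftotext on it, and return the extracted text.
proc pdf_to_text {pdfData {exe pdftotext} {tmpDir .}} {
    # Scratch file names; pid plus clock clicks is an assumption that is
    # unique enough for a sketch, not a hardened temp-file scheme.
    set base [file join $tmpDir "pdf2txt_[pid]_[clock clicks]"]
    set pdfPath "$base.pdf"
    set txtPath "$base.txt"

    # Write the PDF bytes out unmodified (binary translation).
    set fd [open $pdfPath w]
    fconfigure $fd -translation binary
    puts -nonewline $fd $pdfData
    close $fd

    # -layout keeps the text in page order. Note that exec reports an
    # error if the program exits non-zero or writes to stderr.
    set failed [catch {exec $exe -layout $pdfPath $txtPath} err]

    set text ""
    if {!$failed} {
        # pdftotext writes UTF-8 unless told otherwise.
        set fd [open $txtPath r]
        fconfigure $fd -encoding utf-8
        set text [read $fd]
        close $fd
    }

    # Clean up both scratch files whether or not the conversion worked.
    file delete -force $pdfPath $txtPath
    if {$failed} {
        error "pdftotext failed: $err"
    }
    return $text
}

# Hypothetical helper: return the trimmed text that follows "label:" on
# the first line containing it, or "" if the label is not found. With
# -layout output the value may also include text from columns to the right.
proc field_after_label {text label} {
    foreach line [split $text \n] {
        set idx [string first "$label:" $line]
        if {$idx >= 0} {
            set start [expr {$idx + [string length $label] + 1}]
            return [string trim [string range $line $start end]]
        }
    }
    return ""
}
```

          Usage would look something like "set text [pdf_to_text $pdfBytes]" followed by "set name [field_after_label $text "Patient Name"]", where $pdfBytes and the "Patient Name" label are placeholders. The extracted values would then be dropped into the HL7 fields you are building.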

        • #86484
          David Coffey
          Participant

            I’m sorry, I should have mentioned that I am running 5.8.5 on Windows. I am aware of the Linux utility, but I am on the wrong OS.

          • #86485
            David Barr
            Participant

              You can get a Windows version of Poppler (http://blog.alivate.com.au/poppler-windows/), or you can use the Poppler package from Cygwin, which is what I use on Windows.
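
              Whichever build you install, the TCL side only needs to know where the executable lives. A small sketch of one way to locate it: find_pdftotext is a made-up name, and any candidate paths you pass in (for example a Cygwin bin directory) are assumptions about your own install, not standard locations.

```tcl
# Hypothetical helper: return a usable pdftotext path, or "" if none found.
# candidates is an optional list of full paths to try when pdftotext is
# not on the PATH.
proc find_pdftotext {{candidates {}}} {
    # auto_execok searches the PATH (trying .exe and similar extensions
    # on Windows) and returns an empty string when nothing is found.
    set found [auto_execok pdftotext]
    if {[llength $found] > 0} {
        return [lindex $found 0]
    }
    # Fall back to explicitly supplied locations.
    foreach path $candidates {
        if {[file executable $path]} {
            return $path
        }
    }
    return ""
}
```

              For example, "set exe [find_pdftotext [list C:/cygwin64/bin/pdftotext.exe]]", where that path is only a guess at a typical Cygwin layout. The result can then be passed to exec in place of the bare name pdftotext.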
