Has anyone parsed the contents of a PDF document using TCL?


  • Creator
    Topic
  • #55778
    David Coffey
    Participant

    I do know of a shop that has done this, extracting PHI from the PDF to build the HL7 message on the fly, and I am hoping they respond. Has anyone else done it? How is it done?

    David Coffey

  • Author
    Replies
    • #86483
      David Barr
      Participant

      Install poppler-utils on Red Hat. It includes a utility called “pdftotext”. You can write your message out to a file, exec the pdftotext utility from TCL, and read the output back in. You’ll have to parse the results based on the format of the data in the PDF and convert them to HL7. It usually helps to run “pdftotext -layout”, which makes the output follow the order of items on the page rather than the order in which they appear in the PDF file (the two can differ).
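
A minimal sketch of the approach described above, assuming pdftotext is on the PATH and the PDF bytes are already in a Tcl variable. The proc names, the scratch directory argument, and the "Label: value" layout that extract_field looks for are all hypothetical; real parsing depends entirely on how the PDF is laid out.

```tcl
# Convert PDF bytes to text by shelling out to Poppler's pdftotext.
#   pdfBytes : raw PDF content (for example, decoded from the inbound message)
#   workDir  : scratch directory the engine can write to (assumption)
#   exe      : pdftotext executable name or full path
proc pdf_to_text {pdfBytes workDir {exe pdftotext}} {
    set pdfPath [file join $workDir "msg_[pid]_[clock clicks].pdf"]

    # Write the PDF in binary mode so no bytes are translated.
    set fh [open $pdfPath w]
    fconfigure $fh -translation binary
    puts -nonewline $fh $pdfBytes
    close $fh

    # "-layout" keeps the on-page ordering; "-" sends the text to stdout.
    set rc [catch {exec $exe -layout $pdfPath -} text]
    file delete -force $pdfPath
    if {$rc} {
        error "pdftotext failed: $text"
    }
    return $text
}

# Return the value that follows "label:" on a line of the extracted text,
# or an empty string if no line starts with that label.
proc extract_field {text label} {
    set prefix "${label}:"
    set n [string length $prefix]
    foreach line [split $text "\n"] {
        set line [string trim $line]
        if {[string equal -length $n $prefix $line]} {
            return [string trim [string range $line $n end]]
        }
    }
    return ""
}
```

With these in place, something like `set mrn [extract_field [pdf_to_text $pdf $dir] "MRN"]` would pull one value out of the page text, which can then be placed into the HL7 segment being built.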

    • #86484
      David Coffey
      Participant

      I’m sorry, I should have mentioned that I am running 5.8.5 on Windows. I am aware of the Linux utility, but I am on the wrong OS.

    • #86485
      David Barr
      Participant

      You can get a Windows version of Poppler (http://blog.alivate.com.au/poppler-windows/), or you can use the Poppler package from Cygwin, which is what I use on Windows.
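
On Windows the executable is often not on the PATH, so it can help to resolve it explicitly before calling exec. The following is a rough sketch only: find_pdftotext and its installDir argument are hypothetical names, and the directory you pass depends on where Poppler or Cygwin was installed.

```tcl
# Return the path of the pdftotext executable, or raise an error if it cannot be found.
#   installDir : optional Poppler/Cygwin "bin" directory to check first (assumption)
proc find_pdftotext {{installDir ""}} {
    if {$installDir ne ""} {
        foreach name {pdftotext.exe pdftotext} {
            set candidate [file join $installDir $name]
            if {[file exists $candidate]} {
                return $candidate
            }
        }
    }
    # auto_execok searches the PATH (trying .exe and similar extensions on Windows).
    set found [auto_execok pdftotext]
    if {$found ne ""} {
        return [lindex $found 0]
    }
    error "pdftotext not found; install Poppler or pass its bin directory"
}
```

The result can be handed straight to exec, for example `exec [find_pdftotext $popplerBin] -layout $pdfPath -`, where popplerBin and pdfPath are placeholders for your own values.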
