I recommend using a commercial product to do image-to-OCR-to-PDF conversion. There is a reason there is a market for this kind of solution: it is very difficult to get right. That said, here is a link that will point you in the right direction for building your own: Konrad Voelkel, "Linux, OCR and PDF: Scan to PDF/A".
Generally, here is what you do:

1. Extract the Base64-encoded string from OBX.5
2. Decode the string to binary
3. Validate that the binary is a TIFF
4. Create a PDF (to be used later)
5. For each "page" of the TIFF:
   - Perform OCR on the TIFF page to capture the text (usually not accurate enough for clinical use)
   - Add a hidden PDF layer to the page containing that text
   - Add another PDF layer to the page containing the image
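The steps above can be sketched in Python. The decode and TIFF-validation steps need only the standard library; the OCR-to-searchable-PDF step is shown using `pytesseract` and `Pillow` (both assumed installed, along with the Tesseract binary), whose `image_to_pdf_or_hocr` call produces exactly the image-over-hidden-text layering described. The function names here are my own, and multi-page TIFFs may need per-frame handling depending on your Pillow version:

```python
import base64

def decode_obx5(b64_text: str) -> bytes:
    """Decode the Base64 payload extracted from OBX.5."""
    return base64.b64decode(b64_text)

def is_tiff(data: bytes) -> bool:
    """Validate the TIFF magic bytes:
    little-endian 'II*\\x00' or big-endian 'MM\\x00*'."""
    return data[:4] in (b"II*\x00", b"MM\x00*")

def tiff_to_searchable_pdf(tiff_bytes: bytes) -> bytes:
    """Sketch of the OCR step (assumes pytesseract + Pillow + Tesseract).
    Tesseract renders each page as an image layer with an invisible
    OCR text layer beneath it, making the output PDF text-searchable."""
    import io
    from PIL import Image
    import pytesseract
    img = Image.open(io.BytesIO(tiff_bytes))
    return pytesseract.image_to_pdf_or_hocr(img, extension="pdf")
```

This keeps the message-handling part (decode, validate) separate from the OCR part, so you can reject malformed payloads before spending time on OCR.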
Hey Todd, just wanted to thank you for taking the time to respond here. OCR was my missing link; I was using tiff2pdf, but that was simply creating a PDF file that was still just an image, whereas the customer needs a text-searchable file.
I've not arrived at a complete solution yet. My OCR results are poor quality: some letters are changed, and some symbols are missing. I've tried tesseract, gocr, and cuneiform so far, but have not looked into a professional/for-sale solution. My TIFF source image is a graphical representation of data (a printed anesthesia record), so my concern is exactly what you mentioned above: "Usually will not be accurate enough for clinical."
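One thing that sometimes helps with weak OCR results is preprocessing the page image before handing it to the engine. This is a minimal sketch using Pillow (assumed installed); the function name, scale factor, and threshold are my own starting points, not tuned values, and dense graphical records like an anesthesia chart may still not OCR well:

```python
from PIL import Image, ImageOps

def preprocess_for_ocr(img: Image.Image, scale: int = 2, threshold: int = 160) -> Image.Image:
    """Upscale, grayscale, and binarize a page image before OCR.
    Upscaling gives the engine more pixels per glyph; binarizing
    strips gray gridlines that can be mistaken for strokes."""
    g = ImageOps.grayscale(img)
    g = g.resize((g.width * scale, g.height * scale), Image.LANCZOS)
    # Hard threshold: anything lighter than `threshold` becomes white.
    return g.point(lambda p: 255 if p > threshold else 0)
```

Whether this helps depends heavily on the source image, so it is worth comparing the engines again on the preprocessed pages.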
Anyway, just wanted to thank you for your instruction. This has been a great learning exercise for me.