Perhaps we have been lucky, but when we have encountered HTML or RTF from external systems, there has often been a configuration setting on those systems that will select plain text versus formatted as the output. Have you touched base with the system’s vendor to see if that is an option?
There are a number of conversion routines available; however, we have not found any that deal well with formatted items within the document, such as multi-column tables, headers, footers, and other items that require specific positioning on the page.
Unfortunately, I am stuck on a Windows box. Given that, I was able to obtain lynx for win32, which looks like it was compiled back in '99. It appears to work if the HTML is stored in a file and I call it from the command line.
So this question may go back to David: do you know of a way to send lynx an HTML string in an exec call from a Tcl script?
Can I approach it through Tcl this way, or is there a more efficient way?
1) extract the HTML from the HL7 message and write it to a test.tmp file.
2) set html_to_txt_variable [exec lynx.exe -dump test.tmp]
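
For what it's worth, the two steps above could be wrapped into one proc along these lines. This is only a sketch under some assumptions: htmlToText and the temp file naming are made up for illustration, lynx.exe is assumed to be on the PATH, and the old win32 build is assumed to accept -force_html (added here because a .tmp extension does not tell lynx the file is HTML). It is not tested against Cloverleaf.

```tcl
# Hypothetical helper: convert an HTML string to plain text by way of lynx.
# Assumes the lynx build supports -dump and -force_html.
proc htmlToText {html {lynx lynx.exe}} {
    # Temp file name built from the process id and a click counter so that
    # concurrent calls are unlikely to collide (naming is illustrative).
    set tmpFile [file join [pwd] "html2txt_[pid]_[clock clicks].tmp"]

    # Step 1: write the HTML string to the temp file.
    set fd [open $tmpFile w]
    puts -nonewline $fd $html
    close $fd

    # Step 2: run lynx on the file. catch traps a non-zero exit status
    # as well as anything lynx writes to stderr.
    set status [catch {exec $lynx -dump -force_html $tmpFile} result]

    # Remove the temp file whether or not lynx succeeded.
    file delete -force $tmpFile

    if {$status} {
        error "lynx conversion failed: $result"
    }
    return $result
}
```

It would be called as: set html_to_txt_variable [htmlToText $html]. On sending the string directly: Tcl's exec can feed a string to a program's standard input with <<, and lynx has a -stdin switch, but as far as I recall the lynx documentation marks -stdin as Unix only, so the temp file route is probably the safer bet on a Windows box.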