how to sanitize MS Word HTML output?

Lisa Kachold lisakachold at obnosis.com
Mon May 4 11:44:01 MST 2009


Load it into OpenOffice, view the source, and strip the formatting. Then
either copy and paste the result into a text file, or save as text and
rename the file to .html?

Should work?

There are also online document converters; I think MS HTML to text is one
of the conversions they offer?
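
If a plain-text copy is all that is needed, the "save as text" step could
also be scripted. What follows is only a minimal sketch, assuming Python 3
is available; it uses nothing beyond the standard library's html.parser
module. The names TextExtractor and html_to_text are made up for this
example and do not come from any existing tool.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the bodies of <style> and <script>."""

    SKIP = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by removed markup.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    """Return the text content of an HTML string with all markup dropped."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head>"
    '<body><p class="MsoNormal">Course <b>Syllabus</b></p></body></html>'
)
print(html_to_text(sample))
```

Note that this flattens everything onto one line, so paragraph breaks are
lost; it is meant for pulling the words out, not for preserving layout.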

On Mon, May 4, 2009 at 10:27 AM, Steven A. DuChene <
linux-clusters at mindspring.com> wrote:

> Hello all:
> My wife has a class syllabus file from one of her profs at MCC and the file
> is "supposed" to be html but it is that awful sort-of-html crap from
> MS-Office. It is filled with a lot of un-needed style and formatting tags
> as well as all kinds of stupid extra characters due to some MS "standard"
> character formatting stuff. Things like breaking lines in the middle of
> words and then adding an equal sign at the end of the broken line or
> replacing equal signs in the html code with "=3D"
>
> Does anyone know of a tool that will clean this crappy excuse for html
> code up into something more standard? Or failing that just some tool
> or script that will fix the weird character formatting stuff with the
> extra equal signs or "=3D" problems???
> --
> Steve DuChene
>
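
The "=3D" sequences and the "=" at the end of broken lines described in the
quoted message look like quoted-printable transfer encoding (the MIME
encoding used for mail bodies), which would mean the file was saved from a
mail message without being decoded first. If that is the case, Python's
standard quopri module should undo it. This is a sketch under that
assumption; the function names and the file names in the comment are
hypothetical.

```python
import quopri


def decode_quoted_printable(raw: bytes) -> bytes:
    """Turn "=3D" back into "=" and rejoin lines split with a trailing "="."""
    return quopri.decodestring(raw)


def decode_file(src_path: str, dst_path: str) -> None:
    """Decode a quoted-printable file into a new file, working on raw bytes."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(decode_quoted_printable(raw))


# A line broken in the middle of a word, plus an escaped equal sign.
sample = b'<a href=3D"notes.html">syl=\nlabus</a>'
decoded = decode_quoted_printable(sample)
print(decoded.decode("ascii"))

# Hypothetical usage on a real file:
# decode_file("syllabus.htm", "syllabus-decoded.htm")
```

This only fixes the encoding; the extra MS-Office style tags would still be
in the decoded file and need a separate cleanup pass.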



-- 
www.obnosis.com (503)754-4452
"Contradictions do not exist." A. Rand