how to sanitize MS Word HTML output?

JD Austin jd at twingeckos.com
Mon May 4 10:56:57 MST 2009


You can do a lot of the cleanup in Word itself by saving as "Web Page, Filtered" HTML.
I've used OpenOffice to generate better HTML in the past.
Nvu/KompoZer does a decent job cleaning up bad HTML.
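
The "=3D" and the stray "=" at the end of broken lines described in the
quoted message look like quoted-printable encoding (the file was probably
saved out of an email or as a web archive) rather than something Word's
HTML does on its own. If so, that part can be undone before any other
cleanup. A minimal sketch, assuming the text really is quoted-printable;
the function name and the sample string are made up for illustration:

```python
import quopri


def decode_quoted_printable(raw: bytes) -> bytes:
    """Undo quoted-printable encoding: '=3D' becomes '=', and a
    trailing '=' (a soft line break) rejoins the split line."""
    return quopri.decodestring(raw)


# Hypothetical sample imitating the symptoms described in the question.
sample = b'<p class=3D"MsoNormal">A sylla=\nbus line</p>\n'
print(decode_quoted_printable(sample).decode("utf-8"))
```

This only fixes the encoding; the extra Word style tags would still need
one of the tools above.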
--
JD Austin
Twin Geckos Technology Services LLC
jd at twingeckos.com
480.288.8195x201
http://www.twingeckos.com


George Burns <http://www.brainyquote.com/quotes/authors/g/george_burns.html>
- "Happiness is having a large, loving, caring, close-knit family in
another city."

On Mon, May 4, 2009 at 10:27 AM, Steven A. DuChene <
linux-clusters at mindspring.com> wrote:

> Hello all:
> My wife has a class syllabus file from one of her profs at MCC and the file
> is "supposed" to be html but it is that awful sort-of-html crap from
> MS-Office. It is filled with a lot of unneeded style and formatting tags
> as well as all kinds of stupid extra characters due to some MS "standard"
> character formatting stuff. Things like breaking lines in the middle of
> words and then adding an equal sign at the end of the broken line or
> replacing equal signs in the html code with "=3D".
>
> Does anyone know of a tool that will clean this crappy excuse for html
> code up into something more standard? Or failing that just some tool
> or script that will fix the weird character formatting stuff with the
> extra equal signs or "=3D" problems?
> --
> Steve DuChene
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>