You can try this Toasted Spam decoder (it will readily tell you whether the
text is decodable as quoted-printable):
http://www.toastedspam.com/decodeqp

It should be noted that quoted-printable encoded text is generally
associated with email, not Word.

On Mon, May 4, 2009 at 11:55 AM, Matt Graham wrote:
> From: "Steven A. DuChene"
> > It is filled with a lot of un-needed style and formatting tags
> > as well as all kinds of stupid extra characters due to some MS
> > "standard" character formatting stuff. Things like breaking lines
> > in the middle of words and then adding an equal sign at the end
> > of the broken line or replacing equal signs in the html code with
> > "=3D"
>
> That's not HTML. That's quoted-printable encoding. The mail client
> should've automatically converted that to UTF-8 or whatever when
> it saved the file. If you have MIME::QuotedPrint installed, you
> can decode that with a Perl one-liner and see if it looks any better.
>
> > Does anyone know of a tool that will clean this crappy excuse for
> > html code up into something more standard?
>
> "Demoroniser" is probably not what you want. I've seen a few things
> like that over the years, and have gotten rid of most of the junk
> with a bunch of regular expressions. Without a look at what the
> mangled HTML looks like, I couldn't give you a list of sed commands
> to feed this data through.
>
> --
> Matt G / Dances With Crows
> The Crow202 Blog: http://crow202.org/wordpress/
> There is no Darkness in Eternity/But only Light too dim for us to see
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss

--
www.obnosis.com (503)754-4452
"Contradictions do not exist." A. Rand
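
If you would rather decode the file locally than paste it into a web form, here is a minimal Python sketch of the quoted-printable decoding discussed above. It relies only on the standard-library quopri module. The function name, the sample string, and the UTF-8 default are my own assumptions for illustration; a file saved from a Microsoft mail client may well be in windows-1252 instead, in which case pass that as the charset.

```python
import quopri


def decode_quoted_printable(raw: bytes, charset: str = "utf-8") -> str:
    """Decode quoted-printable bytes and return text.

    The charset is an assumption; undecodable bytes are replaced
    rather than raising, so a wrong guess still yields readable output.
    """
    # quopri turns "=3D" back into "=" and removes the soft line
    # breaks (a trailing "=" followed by a newline) that split words.
    decoded = quopri.decodestring(raw)
    return decoded.decode(charset, errors="replace")


# Hypothetical sample showing both symptoms described in the thread:
# an escaped equal sign and a line broken in the middle of a word.
sample = b'<a href=3D"http://example.com/">a long wo=\nrd</a>'
print(decode_quoted_printable(sample))
```

This only undoes the transfer encoding. The leftover Word style and formatting tags would still need a separate cleanup pass, such as the regular expressions Matt mentions.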