On 7/18/06, Craig Brooksby wrote:
>
> First off, let me say "amen" [...] about Linux users
> generally being kind and helpful to newbies. PLUG is an outstanding
> example of that. :-)
>
> OK: I remember someone out there being involved in computer
> forensics. I have to train a bunch of office people on the risks
> inherent in sending out Word, Excel, and other MS-Office docs, which
> contain metadata (deleted words and values, etc.). The recipient,
> with a little work, can find out what the price was before you made
> that final change, or who the customer was before you searched and
> replaced with their name.
>
> If anyone out there can share, and if it's not too OT, please give me
> tips on open-source tools one can use to explore / remove metadata
> from docs. And yes, I prefer the OO suite, but these are lawyers and
> law offices, and Word reigns supreme there (and WordPerfect!).
>
> I'm Googling, and doing what I can, but I'm looking for the voice of
> experience, if it is out there.
>
> Thanks --
>
> Craig
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change you mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>

This may not be exactly what you are looking for, but use your
judgment.

This is a trick that I learned when I was sending a resume to Google,
and they said they would accept it in either plain text or HTML format.
Because of what some "other" resume recipients "prefer" or find
convenient, I had it as a text file and as an MS-Word .doc file. I did
a little bit of checking into the features of MS-Word for saving a
.doc file in HTML format (as a .htm file, or is it .html?).

It turns out that their "default" way of creating an HTML file
produces a file that has a lot of extra stuff in it (probably ignored
by most code that is expecting normal HTML). I think the extra stuff is
intended to allow a lossless "round trip": if one ever wanted to apply
the inverse function and get a *.doc file back, nothing would be lost.
On the other hand, if one wants to be able to edit the HTML file with
an editor that expects plain vanilla ASCII text, then the format
without the extra stuff would be better.

I forget what they call it, something like "annotations". They have 2
HTML formats: html, and "html without annotations". And of course there
is a dire warning if you ask for the latter; it checks to make sure you
know what you are doing (know what you want to do) -- like when you
have some file that "was" in so-called Rich Text Format, and you try to
save it as a plain text file, and it warns that you will be "losing all
formatting".

So, I noticed that if I asked for the "html without annotations"
format, the file was a lot smaller. I am not sure whether saving in
that format, and then using the resulting file to convert "back" to a
*.doc file later, would get rid of all of the "security risk" info
about non-latest versions -- but I think it would. To check, one could
look at the (human-readable HTML "text" in the) "html without
annotations" file. Or, to be really sure, one could look it up, and/or
ask someone who is an expert on MS-Word.

I know this may seem off-topic (from PLUG and GNU/Linux), but practical
knowledge always has a place, so it is reasonable to know something
about it; and as someone said (I think on another branch of this
thread), it is security related. It is also related to the topic of
helping folks who are new (beginners) and/or whose knowledge of
technical stuff is less in some way.
Sometimes these friends who need help are really smart, and could
handle the technical details, but are just not very confident of their
ability to master it -- and I don't blame them, since sometimes it
seems very confusing.

-- 
Mike Schwartz
Glendale AZ
schwartz@acm.org
Mike.L.Schwartz@gmail.com
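
P.S. Here is a rough Python sketch of the "look at the HTML" check
described above, for anyone who would rather not read the file by eye.
The patterns are only my assumptions about the kind of Office-specific
leftovers that an HTML file saved by Word may contain (Office XML
namespaces, "mso" conditional comments, a document-properties block,
mso- style properties, and <del>/<ins> revision tags). The function
name find_word_residue is made up for illustration. An empty result
does not prove that a file is free of metadata; it only means that
none of these particular patterns matched.

```python
import re

# Assumed markers of Word-specific residue; extend the table as needed.
PATTERNS = {
    "office XML namespace": re.compile(r"urn:schemas-microsoft-com:office", re.I),
    "mso conditional comment": re.compile(r"<!--\[if\s[^\]]*mso[^\]]*\]>", re.I),
    "document properties block": re.compile(r"<o:DocumentProperties\b", re.I),
    "mso- style property": re.compile(r"\bmso-[a-z-]+\s*:", re.I),
    "deleted text (<del>)": re.compile(r"<del\b", re.I),
    "inserted text (<ins>)": re.compile(r"<ins\b", re.I),
}


def find_word_residue(html):
    """Return {label: match count} for every pattern found in the HTML text."""
    counts = {}
    for label, pattern in PATTERNS.items():
        hits = len(pattern.findall(html))
        if hits:
            counts[label] = hits
    return counts


# Small made-up sample, standing in for the contents of a saved .htm file.
sample = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office">'
    "<!--[if gte mso 9]><xml><o:DocumentProperties>"
    "<o:Author>Someone</o:Author>"
    "</o:DocumentProperties></xml><![endif]-->"
    '<p style="mso-margin-top-alt:auto">Price: <del>100</del><ins>120</ins></p>'
    "</html>"
)

for label, hits in find_word_residue(sample).items():
    print(f"{label}: {hits}")
```

To check a real file, read it into a string first (for example with
open(path, encoding="utf-8", errors="replace").read()) and pass that
string to find_word_residue.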