On 7/18/06, Craig Brooksby <rcbrxb@gmail.com> wrote:
>
> First off, let me say "amen"  [...]  about Linux users
> generally being kind and helpful to newbies.  PLUG is an outstanding
> example of that.  :-)
>
> OK:  I remember someone out there being involved in computer
> forensics.  I have to train a bunch of office people on the risks
> inherent in sending out Word, Excel, and other MS-Office docs, which
> contain metadata (deleted words and values, etc.).  The recipient,
> with a little work, can find out what the price was before you made
> that final change, or who the customer was before you searched and
> replaced with their name.
>
> If anyone out there can share, and if it's not too OT, please give me
> tips on open-source tools one can use to explore / remove metadata
> from docs.  And yes, I prefer the OO suite, but these are lawyers and
> law offices, and Word reigns supreme there (and WordPerfect!).
>
> I'm Googling, and doing what I can, but I'm looking for the voice of
> experience, if it is out there.
>
> Thanks --
>
> Craig
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change  you mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>

This may not be exactly what you are looking for,
but use your judgment.
This is a trick that I learned when I was sending a resume to Google
and they said they would accept it in either plain text or HTML format.
Because of what some other resume recipients prefer or find
convenient, I had it as a text file and as an MS-Word .doc file.
I did a little checking into MS-Word's feature for saving a .doc file
in HTML format (as a .htm or .html file).
It turns out that its default way of creating an HTML file produces a
file with a lot of extra stuff in it (probably ignored by most code
that expects normal HTML), which I think is intended to allow
a lossless "round trip" if one ever wanted to apply the inverse
function and get a *.doc file back without losing anything.
   On the other hand, if one wants to be able to edit the HTML file
with an editor that expects plain vanilla ASCII text, then the format
without the extra stuff would be better.  I forget what they call it,
something like "annotations".  They have 2 HTML formats:  html,
and "html without annotations".  And of course there is a dire warning
if you ask for the latter:  it checks to make sure you know what you
are doing (know what you want to do), like when you have some
file that was in so-called Rich Text Format and you try to save it
as a plain text file, and it warns that you will be "losing all
formatting".
    So, I noticed that if I asked for the "html without annotations"
format, the file was a lot smaller.  I am not sure whether saving in
that format, and then converting the resulting file back to a *.doc
file, would get rid of all of the "security risk" info about earlier
versions, but I think it would.  To check, one could read the
human-readable HTML text in the "html without annotations" file.
Or, to be really sure, one could look it up and/or ask someone
who is an expert on MS-Word.
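    For anyone who would rather not read the HTML by eye, here is a
rough Python sketch of that check.  The marker list is my assumption
about what Word-specific leftovers tend to look like in exported HTML
(an Office XML namespace, a document-properties block, "mso"
conditional comments, and <ins>/<del> tags for tracked changes); it is
not an exhaustive or authoritative list, and the function name is made
up for illustration.

```python
import re
import sys

# Assumed markers of Word-specific residue in exported HTML.
# This list is illustrative, not complete.
OFFICE_MARKERS = {
    "office namespace": re.compile(r"urn:schemas-microsoft-com:office", re.I),
    "document properties": re.compile(r"<o:DocumentProperties\b", re.I),
    "conditional comment": re.compile(r"<!--\[if [^\]]*mso", re.I),
    "tracked change": re.compile(r"<(?:ins|del)\b", re.I),
}


def find_office_residue(html_text):
    """Return {marker name: count} for every marker seen at least once."""
    counts = {}
    for name, pattern in OFFICE_MARKERS.items():
        n = len(pattern.findall(html_text))
        if n:
            counts[name] = n
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_html.py exported_file.htm
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        residue = find_office_residue(fh.read())
    if residue:
        for name, n in residue.items():
            print(f"{name}: {n}")
    else:
        print("none of the assumed markers were found")
```

An empty result only means that none of these particular markers
turned up; it does not prove the file is clean, so looking at the file
yourself (or asking an expert) is still worthwhile.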
    I know this may seem off-topic (for PLUG and GNU/Linux), but
practical knowledge always has a place, so it is reasonable to
know something about it;  and as someone said (I think on another
branch of this thread), it is security related.
     It is also related to the topic of helping folks who are new
(beginners) and/or whose technical knowledge is limited in
some way.  Sometimes these friends who need help are really
smart and could handle the technical details, but are just not very
confident of their ability to master it, and I don't blame them,
since sometimes it seems very confusing.
-- 
Mike Schwartz
Glendale  AZ
schwartz@acm.org
Mike.L.Schwartz@gmail.com