Re: Why is odt word count less than text?

Author: Brian Cluff
Date:  
To: Main PLUG discussion list
Subject: Re: Why is odt word count less than text?
Your original .txt file is just plain uncompressed text, but when you
convert the file to an OpenDocument format, it is changed into an
XML format plus accompanying styles, images, embedded fonts, etc.
There are actually a lot of individual files that make up an
OpenDocument file; we just don't see them because they are all zipped
together. The zipping gives the file a lot of compression that
the .txt file doesn't have, so even with the extra bloat that
LibreOffice adds, the .odt is still smaller than the original .txt
file.
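
To see the compression at work, here is a minimal Python sketch that lists
each file inside a zip container with its raw and compressed size. The
function name member_sizes is made up for illustration, and the demo builds
a small stand-in archive in memory rather than reading a real .odt, so the
numbers it prints are not the ones from this thread.

```python
import io
import zipfile


def member_sizes(odt_file):
    """Return (name, raw_bytes, compressed_bytes) for each file in the archive.

    odt_file may be a path or a file-like object; an .odt is an ordinary zip.
    """
    with zipfile.ZipFile(odt_file) as zf:
        return [(i.filename, i.file_size, i.compress_size) for i in zf.infolist()]


if __name__ == "__main__":
    # Stand-in for a real document: repetitive text wrapped in a made-up tag.
    text = ("The quick brown fox jumps over the lazy dog. " * 2000).encode("utf-8")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", b"<text>" + text + b"</text>")
    buf.seek(0)
    for name, raw, packed in member_sizes(buf):
        print(f"{name}: {raw} bytes raw, {packed} bytes compressed")
```

Pointing member_sizes at a real file such as soteria.odt should show
content.xml holding roughly the size of the original text before compression.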

On a side note, you can actually use the unzip command to pull apart an
.odt file and play with its contents. You can also use zip to put them
back together, but there is one particular file that needs to be
included without any compression; the name of that file isn't coming
to mind at the moment.
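
As far as the OpenDocument packaging rules go, the file in question should be
"mimetype": it is expected to be the first entry in the zip and stored
without compression. A rough Python sketch of the unpack/repack round trip
under that assumption follows; unpack_odt and repack_odt are illustrative
names, not part of any LibreOffice tooling.

```python
import os
import zipfile


def unpack_odt(odt_path, dest_dir):
    """Extract every file from the .odt (a zip archive) into dest_dir."""
    with zipfile.ZipFile(odt_path) as zf:
        zf.extractall(dest_dir)


def repack_odt(src_dir, out_path):
    """Zip an unpacked .odt directory back into a single file.

    Assumes src_dir contains a "mimetype" file. It is written first and
    stored uncompressed; everything else is deflated. out_path should live
    outside src_dir, otherwise the archive would try to include itself.
    """
    with zipfile.ZipFile(out_path, "w") as zf:
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Zip member names always use forward slashes.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A typical round trip would be unpack_odt("soteria.odt", "work"), edit
work/content.xml, then repack_odt("work", "soteria-edited.odt").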

Brian Cluff

On 09/10/2013 01:45 PM, wrote:
> Here is the dir list for the text before, the odt,
> and the text after exporting from the odt:
>
>     90601 Sep 10 00:51 soteria.txt -- original
>     45851 Sep 10 01:14 soteria.odt -- imported to libre
>     90503 Sep 10 01:17 soteria2.txt -- exported from libre

>
> I've uploaded each file here:
> http://www.upquick.com/temp/soteria.txt -- before
> http://www.upquick.com/temp/soteria.odt
> http://www.upquick.com/temp/soteria2.txt -- after
>
>
> ===== Dazed replied =====
>> Not the same results I get so I would have to ask how you
>> are doing it with examples and tell us if you are measuring
>> all with the same tool. For example, Here are the wc counts
>> for a text file I created, imported into Libre Office, then
>> saved as an odt and as a doc file and finally re-saved
>> as a txt file (the one with the 2 appended).
>>
>> larry@hammerhead:~/Documents/Misc$ wc Ed*
>>      8   306 19456 Edmund Prescott Thiel.doc
>>     80   324 17242 Edmund Prescott Thiel.odt
>>     24   168   974 Edmund Prescott Thiel.txt
>>     24   168   961 Edmund Prescott Thiel2.txt
>>    136   966 38633 total

>
> ===== Joe originally asked =====
>>> Why is it that when I import a text file into libre office
>>> and then export the same text as an .odt document, the
>>> resulting document has a smaller word count and smaller
>>> character count than the original text file has?
>>>
>>> Then if I save the same .odt document as a .txt file, the
>>> resulting .txt file is bigger than the .odt file (actually
>>> almost the same size as the original text file).
>
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list -
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.phxlinux.org/mailman/listinfo/plug-discuss
>

