Why is odt word count less than text?

Brian Cluff brian at snaptek.com
Tue Sep 10 14:29:31 MST 2013


Your original .txt file is just plain uncompressed text, but when you 
convert the file over to an OpenDocument format, it is changed into an 
XML format plus accompanying styles, images, embedded fonts, etc. 
There are actually a lot of individual files that make up an 
OpenDocument file; we just don't see them because they are all zipped 
together.  The zipping compresses the contents in a way that the .txt 
file isn't, so even with the extra bloat that LibreOffice adds to the 
file, it's still smaller than the original .txt file.  Any count taken 
from the .odt file itself (with wc, for example) is measuring that 
compressed archive, not the words in your text.
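
To make the size effect concrete, here is a rough Python sketch (not 
what LibreOffice does internally, just the same kind of zip compression). 
The sample text, the <text> wrapper, and the zipped_size name are all 
made up for illustration; a real .odt holds several files, and real 
prose will not shrink as much as this repetitive sample does.

```python
import io
import zipfile


def zipped_size(text: str) -> int:
    """Return the size in bytes of a zip archive holding `text` as content.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Wrap the text in a little markup to mimic the XML overhead.
        zf.writestr("content.xml", "<text>" + text + "</text>")
    return len(buf.getvalue())


# Hypothetical sample: plain text of roughly the size discussed in this thread.
sample = "The quick brown fox jumps over the lazy dog. " * 2000
raw_size = len(sample.encode("utf-8"))
packed_size = zipped_size(sample)

# The archive comes out smaller than the raw text despite the added markup.
print("plain text bytes:", raw_size)
print("zipped bytes:    ", packed_size)
```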

On a side note, you can actually use the unzip command to pull apart an 
.odt file and play with its contents.  You can also use zip to put it 
back together, but there is one particular file that needs to be 
included without any compression; the name of that file isn't coming 
to mind at the moment.
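
For reference, the OpenDocument packaging rules call for the file named 
mimetype to be the first entry in the archive and to be stored without 
compression. The following is a minimal Python sketch of the 
unpack/repack round trip using the standard zipfile module instead of 
the unzip and zip commands. The function names and paths are 
placeholders, and it assumes the unpacked directory already contains a 
mimetype file.

```python
import os
import zipfile


def unpack_odt(odt_path: str, out_dir: str) -> None:
    """Extract every member of the .odt archive into out_dir."""
    with zipfile.ZipFile(odt_path) as zf:
        zf.extractall(out_dir)


def repack_odt(src_dir: str, odt_path: str) -> None:
    """Zip src_dir back into an .odt, writing mimetype first and uncompressed.

    odt_path should live outside src_dir so the walk below does not
    pick up the archive while it is being written.
    """
    with zipfile.ZipFile(odt_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # mimetype goes in first, stored as-is (no compression).
        zf.write(
            os.path.join(src_dir, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else is added with normal deflate compression.
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                arcname = os.path.relpath(full_path, src_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    zf.write(full_path, arcname=arcname)
```

Usage would look something like unpack_odt("soteria.odt", "soteria_parts"), 
then editing the files under soteria_parts, then 
repack_odt("soteria_parts", "soteria_new.odt"). Only unpack archives you 
trust.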

Brian Cluff

On 09/10/2013 01:45 PM, joe at actionline.com wrote:
> Here is the dir list for the text before, the odt,
> and the text after exporting from the odt:
>
>     90601 Sep 10 00:51 soteria.txt -- original
>     45851 Sep 10 01:14 soteria.odt -- imported to libre
>     90503 Sep 10 01:17 soteria2.txt -- exported from libre
>
> I've uploaded each file here:
> http://www.upquick.com/temp/soteria.txt -- before
> http://www.upquick.com/temp/soteria.odt
> http://www.upquick.com/temp/soteria2.txt -- after
>
>
> ===== Dazed replied =====
>> Not the same results I get so I would have to ask how you
>> are doing it with examples and tell us if you are measuring
>> all with the same tool. For example, here are the wc counts
>> for a text file I created, imported into Libre Office, then
>> saved as an odt and as a doc file and finally re-saved
>> as a txt file (the one with the 2 appended).
>>
>> larry at hammerhead:~/Documents/Misc$ wc Ed*
>>      8   306 19456 Edmund Prescott Thiel.doc
>>     80   324 17242 Edmund Prescott Thiel.odt
>>     24   168   974 Edmund Prescott Thiel.txt
>>     24   168   961 Edmund Prescott Thiel2.txt
>>    136   966 38633 total
>
> ===== Joe originally asked =====
>>> Why is it that when I import a text file into libre office
>>> and then export the same text as an .odt document, the
>>> resulting document has a smaller word count and smaller
>>> character count than the original text file has?
>>>
>>> Then if I save the same .odt document as a .txt file, the
>>> resulting .txt file is bigger than the .odt file (actually
>>> almost the same size as the original text file).
>
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.phxlinux.org/mailman/listinfo/plug-discuss
>


