Why is odt word count less than text?
Brian Cluff
brian at snaptek.com
Tue Sep 10 14:29:31 MST 2013
Your original .txt file is just plain uncompressed text, but when you
convert the file to an OpenDocument format, it is changed into an
XML format plus accompanying styles, images, embedded fonts, etc.
There are actually a lot of individual files that make up an
OpenDocument file; we just don't see them because they are all zipped
together. The zipping provides a lot of compression that the .txt file
doesn't have, so even with the extra bloat that LibreOffice adds, the
.odt is still smaller than the original .txt file. Tools like wc just
count the raw zipped bytes, so their word and character counts for the
.odt don't reflect the text inside.
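To make the size effect concrete, here is a minimal Python sketch, not
what LibreOffice actually does internally. It assumes a made-up,
repetitive sample in place of soteria.txt and a heavily simplified
stand-in for content.xml (the names paragraph, plain_text, and
content_xml are hypothetical). It wraps the text in XML, zips it with
deflate, and prints the three sizes.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Hypothetical stand-in for soteria.txt: ordinary, repetitive prose.
paragraph = (
    "Your original text file is just plain uncompressed text, but an "
    "OpenDocument file is a zip archive of XML files."
)
plain_text = "\n".join(paragraph for _ in range(500)).encode("utf-8")

# Wrap every line in a simplified <text:p> element. A real content.xml
# also carries namespaces and style references; this is an approximation.
lines = plain_text.decode("utf-8").splitlines()
body = "".join(f"<text:p>{escape(line)}</text:p>" for line in lines)
content_xml = f"<office:text>{body}</office:text>".encode("utf-8")

# Zip the XML in memory using deflate compression.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("content.xml", content_xml)
packed_size = len(buffer.getvalue())

print("plain text bytes:", len(plain_text))
print("XML bytes:       ", len(content_xml))
print("zipped bytes:    ", packed_size)
```

The XML is larger than the plain text, yet the zipped result is
smaller than either. The sample repeats one sentence, so it compresses
far better than real prose would; Joe's file shrank by roughly half
(90601 to 45851 bytes).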
On a side note, you can actually use the unzip command to pull apart an
.odt file and play with its contents. You can also use zip to put it
back together, but there is one particular file that needs to be
included without any compression; the name of that file isn't coming
to mind at the moment.
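The same pull-apart and put-back-together round trip can be sketched
with Python's zipfile module instead of the unzip and zip commands.
This is only an illustration under stated assumptions: the package is a
tiny made-up one built in memory rather than a real .odt, and
build_sample_package and repack are hypothetical helper names. The
sketch also assumes that the file which must stay uncompressed is the
one named mimetype, and that it must come first in the archive; that is
my understanding of the OpenDocument packaging rules, not something
stated in this thread.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Build a tiny, hypothetical ODF-like package in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "content.xml",
            "<office:text>" + "<text:p>hello</text:p>" * 200 + "</office:text>",
            compress_type=zipfile.ZIP_DEFLATED,
        )
        archive.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_DEFLATED,  # deliberately "wrong"
        )
    return buffer.getvalue()


def repack(package: bytes) -> bytes:
    """Read every member and zip it again, mimetype first and stored."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as source, \
            zipfile.ZipFile(out, "w") as target:
        names = source.namelist()
        # Assumption: mimetype goes first and is stored without compression.
        ordered = [n for n in names if n == "mimetype"]
        ordered += [n for n in names if n != "mimetype"]
        for name in ordered:
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            target.writestr(name, source.read(name), compress_type=method)
    return out.getvalue()


repacked = repack(build_sample_package())
with zipfile.ZipFile(io.BytesIO(repacked)) as archive:
    for info in archive.infolist():
        kind = "stored" if info.compress_type == zipfile.ZIP_STORED else "deflated"
        print(f"{info.filename:12} {info.file_size:6} -> {info.compress_size:6} {kind}")
```

To try it on a real document, open the .odt with
zipfile.ZipFile("soteria.odt") in place of the in-memory sample; the
listing shows which members are stored and which are deflated.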
Brian Cluff
On 09/10/2013 01:45 PM, joe at actionline.com wrote:
> Here is the dir list for the text before, the odt,
> and the text after exporting from the odt:
>
> 90601 Sep 10 00:51 soteria.txt -- original
> 45851 Sep 10 01:14 soteria.odt -- imported to libre
> 90503 Sep 10 01:17 soteria2.txt -- exported from libre
>
> I've uploaded each file here:
> http://www.upquick.com/temp/soteria.txt -- before
> http://www.upquick.com/temp/soteria.odt
> http://www.upquick.com/temp/soteria2.txt -- after
>
>
> ===== Dazed replied =====
>> Those are not the results I get, so I have to ask how you
>> are doing it, with examples, and whether you are measuring
>> everything with the same tool. For example, here are the wc counts
>> for a text file I created, imported into LibreOffice, then
>> saved as an .odt and as a .doc file, and finally re-saved
>> as a .txt file (the one with the 2 appended).
>>
>> larry at hammerhead:~/Documents/Misc$ wc Ed*
>>    8  306 19456 Edmund Prescott Thiel.doc
>>   80  324 17242 Edmund Prescott Thiel.odt
>>   24  168   974 Edmund Prescott Thiel.txt
>>   24  168   961 Edmund Prescott Thiel2.txt
>>  136  966 38633 total
>
> ===== Joe originally asked =====
>>> Why is it that when I import a text file into LibreOffice
>>> and then export the same text as an .odt document, the
>>> resulting document has a smaller word count and smaller
>>> character count than the original text file has?
>>>
>>> Then if I save the same .odt document as a .txt file, the
>>> resulting .txt file is bigger than the .odt file (actually
>>> almost the same size as the original text file).
>
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.phxlinux.org/mailman/listinfo/plug-discuss
>