file size on disk vs used drastically different

Brian Cluff brian at snaptek.com
Thu Apr 15 14:50:43 MST 2010


On 04/15/2010 01:16 PM, Eric Shubert wrote:
> der.hans wrote:
>> On 15 Apr 2010, Shawn Badger wrote:
>>
>>> I came across a weird problem this morning. What would cause a file
>>> to be
>>> reported as 251M for used space and 1.3G for size on disk?
>>>
>>> [root@cc1lnx5 axprac]# ls -sh cafrap_1.dbf; ls -lh cafrap_1.dbf
>>> *251M* cafrap_1.dbf
>>> -rw-r----- 1 oraxprac axprac *1.3G* Apr 15 09:47 cafrap_1.dbf
>>> [root@cc1lnx5 axprac]#
>>
>> A bunch of nulls.
>>
>> Essentially, the file has allocated 1.3GB of space, but since a bunch of
>> what it's storing are nulls the filesystem cheats and doesn't use space
>> for them.
>>
>> That space can expand out during backups and other operations, so be
>> careful copying it around.
>>
>>> I have seen this to a smaller extent with some files but never a
>>> variance of
>>> this size.
>>> This file happens to be an Oracle 11G database table file.
>>
>> I believe Oracle allots a configured amount of space for DB tables. The
>> space allotted but not used should be nulls.
>>
>> $ qemu-img create /tmp/Beispiel.img 10G
>> Formatting '/tmp/Beispiel.img', fmt=raw size=10737418240
>>
>> $ ls -sh /tmp/Beispiel.img ; ls -lh /tmp/Beispiel.img
>> 0 /tmp/Beispiel.img
>> -rw-r--r-- 1 lufthans lufthans 10G 2010-04-15 11:06 /tmp/Beispiel.img
>>
>> It is to filesystem allocation what ticket overselling is to airlines :).
>>
>> ciao,
>>
>> der.hans
>>
>
> Interesting.
>
> I wonder if VMware thick virtual disks (vmdk files) exhibit this same
> kind of behavior. Anyone know?

Yup, you have just discovered a sparse file.  I used to use sparse files
to great extent at the school, so that I could create a 10 gig VMware
image that I could then install on any machine, no matter what size hard
drive it had.
Believe it or not, I could get a Linux image as well as a 10 gig Windows
VMware image onto an 8 gig hard drive and it would all work just fine.
Of course I could only add a certain amount of data to the VMware image
before the hard drive told it that it was full and the whole thing came
crashing down, but the way I was doing things, that never happened.
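
To see the behaviour described above from a shell, here is a minimal
sketch, assuming GNU coreutils (truncate, stat -c, dd) on Linux.  The file
name disk.img and the sizes are made up for illustration, and the exact
block counts will vary by filesystem.

```shell
#!/bin/sh
# Sketch only: file name and sizes are illustrative, not from the thread.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Set the apparent size to 10 MiB without writing any data.
truncate -s 10M disk.img
size_before=$(stat -c %s disk.img)     # bytes, the figure ls -l shows
blocks_before=$(stat -c %b disk.img)   # blocks actually allocated on disk

# Write 1 MiB of real data 4 MiB into the file; only that region needs blocks.
dd if=/dev/urandom of=disk.img bs=1M count=1 seek=4 conv=notrunc 2>/dev/null
size_after=$(stat -c %s disk.img)
blocks_after=$(stat -c %b disk.img)

echo "apparent size:    $size_before -> $size_after bytes"
echo "allocated blocks: $blocks_before -> $blocks_after"
```

The apparent size stays at 10 MiB while the allocation grows only by
roughly what was written.  If the disk fills up before a hole can be backed
by real blocks, the write fails, which is the crash scenario mentioned
above.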

Look into the --sparse=WHEN option for cp and the -S flag for tar.
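
As a rough illustration of those two options, the sketch below copies and
archives a sparse file while keeping its holes.  It assumes GNU cp and GNU
tar; the file names (sparse.img, copy.img, image.tar) are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU cp and GNU tar; file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a 10 MiB file that is mostly hole, with 1 MiB of data in the middle.
truncate -s 10M sparse.img
dd if=/dev/urandom of=sparse.img bs=1M count=1 seek=2 conv=notrunc 2>/dev/null

# Copy it, writing runs of zero bytes as holes in the destination.
cp --sparse=always sparse.img copy.img

# Archive it with sparse handling (-S), then restore into another directory.
tar -S -cf image.tar sparse.img
mkdir restored
tar -xf image.tar -C restored

# Compare apparent size with real disk usage for all three files.
du -k --apparent-size sparse.img copy.img restored/sparse.img
du -k sparse.img copy.img restored/sparse.img
```

The WHEN in --sparse=WHEN can be auto, always, or never; always is the
setting that creates holes even if the source file was not sparse.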

Just a little note that has nothing to do with your question, because I
feel like typing a little more... when creating the VMware image I
would do everything needed to the image until I had exactly what I
wanted.  I would then use the "eraser" program to write zeros to every
part of the virtual disk that wasn't in use.  I would then tell tar to
honor the sparseness of the file while I was creating an archive that I
would later use to reimage other machines.
Then when I would restore the image onto a fresh machine it would be
written back using only about 1.8 gig of actual space instead of the 10
gig that the file said it was, and indeed it WAS on the original computer.
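
The reason zeroing the free space matters can be sketched on a small scale:
zeros that were actually written occupy real blocks until something turns
them back into holes.  The example below uses cp --sparse=always for that
step rather than tar, and assumes GNU coreutils; the file names are
hypothetical.  Whether tar -S alone compacts written zeros may depend on
the tar version and how it detects holes, so that is worth checking
locally.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils; file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Write 10 MiB of real zero bytes.  Unlike truncate, this normally
# allocates blocks for the whole file.
dd if=/dev/zero of=zeroed.img bs=1M count=10 2>/dev/null

# Copy while scanning for zero runs, so they become holes in the copy.
cp --sparse=always zeroed.img compact.img

# Same apparent size and same contents, usually far less disk used.
du -k --apparent-size zeroed.img compact.img
du -k zeroed.img compact.img
cmp zeroed.img compact.img && echo "contents identical"
```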

Brian Cluff


More information about the PLUG-discuss mailing list