finding duplicates?

Kevin Buettner plug-discuss@lists.plug.phoenix.az.us
Thu, 28 Feb 2002 13:06:27 -0700


On Feb 28, 12:39pm, Kevin Buettner wrote:

> > http://www.google.com/search?hl=en&q=linux+find+duplicate+files
> > http://www.perlmonks.org/index.pl?node_id=2712&lastnode_id=1747
> 
> The script below is similar to the solution on the perlmonks page,
> but is perhaps somewhat simpler:


On Feb 28, 12:41pm, David A. Sinck wrote:

> I think perhaps running md5sum on every file might be a bit of a CPU
> heater.  If I were inclined to be nice to the CPU, I'd check size then
> md5sum if the same...unless your coffee's cold or you have cycles to
> burn.  :-)

Now that I look more closely at the script on the perlmonks page, I
see that that's what it's doing.  It collects the names of the files
of identical sizes in a hash and then runs the md5 algorithm on those...

Okay, so my version is simpler, but slower...
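
For what it's worth, here is a rough sketch of the size-first idea in
Python.  It is neither the perlmonks script nor mine, just an
illustration; the names find_duplicates and md5_of are made up, and it
assumes regular files that fit the usual read-in-chunks pattern:

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose size and MD5 both match."""
    # Pass 1: cheap grouping by size (a stat call per file, no reads).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: checksum only the files that share a size with another file.
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, so it cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[md5_of(path)].append(path)
        duplicates.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return duplicates
```

Calling find_duplicates(".") returns one list per set of identical
files.  Files with a unique size are never opened, which is where the
CPU savings come from.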

Kevin