Re: How to locate all duplicate files?

Author: Kevin Fries
Date:  
To: Main PLUG discussion list
Subject: Re: How to locate all duplicate files?
OK, you have several questions...

- First, a simple script to find all duplicate filenames.
The problem is that you need to get a list of all files on your system, then
compare the names, minus the path. So I would try something like this (not
fully tested):

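#!/bin/bash

# List every regular file; -P never follows symlinks.
find -P / -type f > /tmp/files.txt
# Strip the leading path, keeping only the filename.
sed -i -e 's#.*/\(.*\)$#\1#' /tmp/files.txt
# Sort so identical names are adjacent, as uniq requires.
sort /tmp/files.txt > /tmp/files1.txt
rm /tmp/files.txt
# Write every repeated name back to the original file (-D is a GNU option).
uniq -D /tmp/files1.txt > /tmp/files.txt
rm /tmp/files1.txt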

My logic:
First, get a list of all files, looking only at regular files and ignoring
symlinks (which are duplicates by definition).
Next, strip the path from the names in the temp file.
Now that you only have filenames, sort the list into a second temp file.
Delete the original file.
Now find all duplicates, and place those names back into the original
file.
Delete the second temp file.

Now you should have a list of all duplicate filenames in /tmp/files.txt.
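
The same steps can also be chained into one pipeline with no temp files, which makes it easy to limit the search to a single directory. This is only a sketch, not tested beyond a toy directory: dup_names is a made-up helper name, -printf is a GNU find extension, and filenames containing newlines will confuse it.

```shell
#!/bin/bash
# dup_names [DIR]: print each filename that occurs more than once under DIR.
# dup_names is a hypothetical helper name; DIR defaults to the current directory.
dup_names() {
    dir="${1:-.}"
    # %f prints the name with all leading directories removed (GNU find),
    # sort groups identical names, uniq -d prints one line per repeated name.
    find -P "$dir" -type f -printf '%f\n' | sort | uniq -d
}
```

For example, "dup_names /home" would list the names that appear in more than one place under /home. Here uniq -d is used instead of -D so each name shows up once.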

- How can I tell if they are just duplicate filenames, or if they are
actually duplicate files?
For each filename, find all copies of the file with the find command, and
run them through sha1sum like so:

find / -type f -name '<filename to check>' -exec sha1sum {} +

Files with the same sha1sum almost certainly have identical contents.
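
To avoid comparing the checksums by eye, the output can be sorted and grouped. Again just a sketch under assumptions: same_content is a made-up helper name, and uniq's -w and --all-repeated options are GNU extensions. The 40 is the length of a SHA-1 checksum in hex.

```shell
#!/bin/bash
# same_content NAME [DIR]: show the copies of NAME under DIR whose contents match.
# same_content is a hypothetical helper name; NAME is treated as a find -name pattern.
same_content() {
    name="$1"
    dir="${2:-.}"
    # sha1sum prints "<40 hex chars>  <path>"; sorting puts equal checksums
    # next to each other, and uniq compares only the first 40 characters.
    find -P "$dir" -type f -name "$name" -exec sha1sum {} + |
        sort |
        uniq -w40 --all-repeated=separate
}
```

Copies with matching contents come out in groups separated by a blank line. Copies whose checksum is unique are left out, so those are the same-name, different-content files.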

You may need to check my syntax on some of this, but it should get the job
done.

Kevin Fries
On Wed, Apr 21, 2010 at 1:53 PM, <> wrote:

>
> What command syntax can I use to locate all duplicate files (filenames) on
> my system? Or, more specifically, within any specified directory on the
> system?
>
> Also, how can I tell which duplicates have identical contents and which
> duplicates have different content (or at least different file sizes)?
>
>
>
> ---------------------------------------------------
> PLUG-discuss mailing list -
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>
