On 9/28/24 21:06, Michael via PLUG-discuss wrote:
> About a year ago I messed up by accidentally copying a folder with other
> folders into another folder. I'm running out of room and need to find that
> directory tree and get rid of it. All I know for certain is that it is
> somewhere in my home directory. I THINK it is my pictures directory with
> ARW files.
> ChatGPT told me to use fdupes, but it told me to use an exclude option
> (which I found out it doesn't have) to avoid config files (and I was
> planning on adding to that as I discovered other stuff I didn't want). Then
> it told me to use find, but I got an error, which leads me to believe it
> doesn't know what it's talking about!
> Could someone help me out?
>
First, someone said you need to run updatedb before running find. No,
sorry, updatedb is for locate, not find. Find actively walks the
directory tree every time you run it. Locate searches the database built
by updatedb, so it only knows about files that existed the last time
updatedb ran.
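Since find reads the filesystem directly, it can also narrow down where the
ARW files live without any database. Here is a minimal sketch, assuming GNU
find (for -printf); arw_dirs is a name I made up for illustration. It counts
.ARW files per directory, so a copied tree should show up as a second set of
directories with matching counts:

```shell
# Hypothetical helper (not from the original post): count .ARW files per
# directory under the given starting point (default: current directory).
# Assumes GNU find, whose -printf '%h\n' prints each file's directory.
arw_dirs() {
    find "${1:-.}" -type f -iname '*.arw' -printf '%h\n' | sort | uniq -c | sort -rn
}
```

Running arw_dirs ~ would list the directories holding the most ARW files
first.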
Ok, now to answer the question. I've got a similar situation, but in
spades. Every time I did a backup, I did an entire copy of everything,
so I've got ... oh, 10, 20, 30 copies of many things. I'm working on
scripts to help reduce that, but for now doing it somewhat manually, I
suggest the following command:
cd (the directory of interest, possibly your home dir)
find . -type f -print0 | xargs -0 md5sum | sort > list.of.files
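This will create a list of files, sorted by their md5sum. If you want
to be lazy and not search that file for duplicate md5sums by hand, consider
uniq. Like this:
uniq -c -d -w 32 list.of.files
The -w 32 limits the comparison to the first 32 characters of each line,
which is the md5sum itself. This will print one line per set of duplicates:
a count, the md5sum, and the first file in that set. For example,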
out of a list of 42,279 files in a certain directory on my computer,
here's the result:
2 73d249df037f6e63022e5cfa8d0c959b  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160321-223138.png
5 9b162ac35214691461cc0f0104fb91ce  _files/melissa/Documents/EPHESUS/Office Stuff/SPD/SPD SUMMER 2016 (1).pdf
3 b396af67f2cd75658397efd878a01fb8  _files/dads_zipdisks/2003-1/CLASS at VBC Sp-03/CLASS BKUP - Music Reading & Sight Singing Class/C & D Major & Minor Scales & Chords.mct
2 cd83094e0c4aeb9128806b5168444578  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160318-222051.png
2 d1a5a1bec046cc85a3a3fd53a8d5be86  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160410-145331.png
2 fa681c54a2bd7cfa590ddb8cf6ca1cea  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160312-113340.png
Originally the _files directory had MANY duplicates; I've now managed to
get that down to the list above.
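If you want to see every copy rather than just one line per set, GNU uniq
can print all members of each group. Here is a rough sketch, assuming GNU
coreutils and findutils; dup_groups and dup_dirs are names I made up for
illustration. The second helper counts duplicated files per directory, which
may help spot a whole copied directory tree:

```shell
# Hypothetical helpers (not from the original post). Assumes GNU md5sum,
# uniq, xargs and find. File names containing newlines or backslashes are
# escaped by md5sum and are not handled specially here.

# Print every file that shares an md5sum with another file, with a blank
# line between groups. -w 32 compares only the 32-character md5sum.
dup_groups() {
    find "${1:-.}" -type f -print0 | xargs -0 -r md5sum | sort |
        uniq -w 32 --all-repeated=separate
}

# Count duplicated files per directory, largest count first.
# The sed expression drops the md5sum and the file name, leaving the directory.
dup_dirs() {
    dup_groups "$1" |
        sed -n 's|^[0-9a-f]\{32\} [ *]\(.*\)/[^/]*$|\1|p' |
        sort | uniq -c | sort -rn
}
```

Something like dup_dirs ~/Pictures would then show which directories hold
the most duplicated files. Look the output over before deleting anything.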
Anyway, there you go. Happy scripting.