Thank you so much! After running it, I find it only finds the duplicates in ~. I need to find the duplicates across all the directories under my home directory. After looking at the man page and searching for "recu", it seems it recurses by default, unless I am reading it wrong.

I tried the uniq command, but:

uniq -c -d -w list.of.files
uniq: list.of.files: invalid number of bytes to compare

Isn't uniq used to find the differences between two files? I have a very rudimentary understanding of Linux, so I'm sure I'm wrong. All the files in list.of.files are invisible files (prefaced with a period). And isn't there a way to sort things depending on their column (column 1 md5sum, column 2 file name)?

On Mon, Sep 30, 2024 at 2:56 AM Rusty Carruth via PLUG-discuss
<plug-discuss@lists.phxlinux.org> wrote:
>
> On 9/28/24 21:06, Michael via PLUG-discuss wrote:
> > About a year ago I messed up by accidentally copying a folder with other
> > folders into another folder. I'm running out of room and need to find
> > that directory tree and get rid of it. All I know for certain is that it
> > is somewhere in my home directory. I THINK it is my pictures directory
> > with ARW files.
> > ChatGPT told me to use fdupes, but it told me to use an exclude option
> > (which I found out it doesn't have) to avoid config files (and I was
> > planning on adding to that as I discovered other stuff I didn't want).
> > Then it told me to use find, but I got an error, which leads me to
> > believe it doesn't know what it's talking about!
> > Could someone help me out?
>
> First, someone said you need to run updatedb before running find. No,
> sorry, updatedb is for using locate, not find. Find actively walks the
> directory tree. Locate searches the text (I think) database built by
> updatedb.
>
> Ok, now to answer the question. I've got a similar situation, but in
> spades. Every time I did a backup, I did an entire copy of everything,
> so I've got ... oh, 10, 20, 30 copies of many things. I'm working on
> scripts to help reduce that, but for now, doing it somewhat manually, I
> suggest the following command:
>
> cd (the directory of interest, possibly your home dir); find . -type f -print0 | xargs -0 md5sum | sort > list.of.files
>
> This will create a list of files, sorted by their md5sum. If you want
> to be lazy and not search that file for duplicate md5sums, consider
> uniq. Like this:
>
> uniq -c -d -w list.of.files
>
> This will print the list of files which are duplicates.
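One wrinkle in the uniq invocation just above: -w (--check-chars) expects a numeric argument saying how many characters of each line to compare, so written as above the file name gets swallowed as that argument, which is exactly the "invalid number of bytes to compare" error quoted at the top of this message. A minimal corrected sketch, assuming list.of.files is the sorted md5sum listing produced by the pipeline above (MD5 hashes are 32 hex characters, so comparing only the first 32 characters of each line groups them by hash):

uniq -c -d -w 32 list.of.files    # -c counts each group of lines whose first 32 chars match, -d prints one line per repeated group

Swapping -d for -D (--all-repeated) prints every member of each duplicate group rather than a single representative, which is handy when deciding which copies to delete.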
> For example, out of a list of 42,279 files in a certain directory on my
> computer, here's the result:
>
> 2 73d249df037f6e63022e5cfa8d0c959b  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160321-223138.png
> 5 9b162ac35214691461cc0f0104fb91ce  _files/melissa/Documents/EPHESUS/Office Stuff/SPD/SPD SUMMER 2016 (1).pdf
> 3 b396af67f2cd75658397efd878a01fb8  _files/dads_zipdisks/2003-1/CLASS at VBC Sp-03/CLASS BKUP - Music Reading & Sight Singing Class/C & D Major & Minor Scales & Chords.mct
> 2 cd83094e0c4aeb9128806b5168444578  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160318-222051.png
> 2 d1a5a1bec046cc85a3a3fd53a8d5be86  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160410-145331.png
> 2 fa681c54a2bd7cfa590ddb8cf6ca1cea  _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160312-113340.png
>
> Originally the _files directory had MANY duplicates, now I've managed to
> get that down to the above list...
>
> Anyway, there you go. Happy scripting.
>
> ---------------------------------------------------
> PLUG-discuss mailing list: PLUG-discuss@lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> https://lists.phxlinux.org/mailman/listinfo/plug-discuss

-- 
:-)~MIKE~(-:
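On the questions at the top of this message (recursion and sorting by column): the find pipeline quoted above does descend into every subdirectory below wherever you cd first, but fdupes only recurses when given -r/--recurse, which would explain a run that reported duplicates sitting directly in ~ and nothing deeper. And uniq is not a tool for diffing two files (that is diff); it collapses or counts adjacent identical lines in a single stream, which is why list.of.files has to be sorted before uniq can spot repeated hashes. A minimal sketch, assuming the duplicates live somewhere under the home directory and that list.of.files is the sorted hash listing produced by the pipeline above:

fdupes -r ~                  # recurse through everything under home, printing groups of identical files
sort -k1,1 list.of.files     # re-sort by column 1 only (the md5sum)
sort -k2 list.of.files       # sort by column 2 onward (the file path)

The first line is the fdupes route; the other two show sort's -k option, which selects the field(s) to sort on, so either column of the hash listing can drive the ordering.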