Thanks for your help... I don't have access to the computer now, but I'll let
you know if it works out when I do.

On Mon, Sep 30, 2024, 1:16 PM Rusty Carruth via PLUG-discuss
<plug-discuss@lists.phxlinux.org> wrote:

> Oops, you are correct, the uniq command should have "-w 34 list.of.files",
> not "-w list.of.files". Sorry! (Here's what I'd typed and what I should
> have cut/pasted:
>
> root@rusty-MS-7851:/backups1/backup_system_v2# uniq -c -d -w 34 sorted.new_filesA.md5|less ; wc -l sorted.new_filesA.md5
> 42279 sorted.new_filesA.md5
> root@rusty-MS-7851:/backups1/backup_system_v2# uniq -c -d -w 34 sorted.new_filesA.md5
>
> Sorry again!)
>
> Also, if you want to get a list of files and their MD5 sums from 'higher
> up' in the directory tree, just change the starting directory in your
> find command to that higher-up location. However, you might need to run
> the entire find and md5sum sequence as root if the directories (and
> files) you care about don't have read permission for you. (So, to find
> ALL files everywhere on your computer, change the ~ to /. You'll
> certainly get lots of "permission denied" errors if you do that as
> yourself and not root. But starting at / will traverse ALL directories
> on your computer, including /dev and others you probably don't care
> about. There are some useful options to find (like "don't go to a
> different filesystem") you might want to use -- see the man page for
> find to find them. ;-)
>
> On 9/30/24 07:05, Michael via PLUG-discuss wrote:
> > Thank you so much! After running it, I find it only finds the
> > duplicates in ~. I need to find the duplicates across all the
> > directories under home. After looking at the man page and searching
> > for "recu", it seems it recurses by default, unless I am reading it
> > wrong.
> > I tried the uniq command, but:
> >
> > uniq -c -d -w list.of.files
> > uniq: list.of.files: invalid number of bytes to compare
> >
> > Isn't uniq used to find the differences between two files? I have a
> > very rudimentary understanding of Linux, so I'm sure I'm wrong.
> >
> > All the files in list.of.files are invisible files (prefaced with a
> > period). And isn't there a way to sort things depending on their
> > column (column 1: md5sum, column 2: file name)?
> >
> > On Mon, Sep 30, 2024 at 2:56 AM Rusty Carruth via PLUG-discuss
> > <plug-discuss@lists.phxlinux.org> wrote:
> >
> >> On 9/28/24 21:06, Michael via PLUG-discuss wrote:
> >>> About a year ago I messed up by accidentally copying a folder with
> >>> other folders into another folder. I'm running out of room and need
> >>> to find that directory tree and get rid of it. All I know for
> >>> certain is that it is somewhere in my home directory. I THINK it is
> >>> my pictures directory with ARW files.
> >>> ChatGPT told me to use fdupes, but it told me to use an exclude
> >>> option (which I found out it doesn't have) to avoid config files
> >>> (and I was planning on adding to that as I discovered other stuff I
> >>> didn't want). Then it told me to use find, but I got an error, which
> >>> leads me to believe it doesn't know what it's talking about!
> >>> Could someone help me out?
> >>>
> >> First, someone said you need to run updatedb before running find. No,
> >> sorry, updatedb is for using locate, not find. Find actively walks
> >> the directory tree. Locate searches the text (I think) database built
> >> by updatedb.
> >>
> >> Ok, now to answer the question. I've got a similar situation, but in spades.
> >> Every time I did a backup, I did an entire copy of everything, so
> >> I've got ... oh, 10, 20, 30 copies of many things. I'm working on
> >> scripts to help reduce that, but for now, doing it somewhat manually,
> >> I suggest the following command:
> >>
> >> cd (the directory of interest, possibly your home dir) ; find . -type f -print0 | xargs -0 md5sum | sort > list.of.files
> >>
> >> This will create a list of files, sorted by their md5sum. If you want
> >> to be lazy and not search that file for duplicate md5sums, consider
> >> uniq. Like this:
> >>
> >> uniq -c -d -w list.of.files
> >>
> >> This will print the list of files which are duplicates. For example,
> >> out of a list of 42,279 files in a certain directory on my computer,
> >> here's the result:
> >>
> >> 2 73d249df037f6e63022e5cfa8d0c959b
> >> _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160321-223138.png
> >> 5 9b162ac35214691461cc0f0104fb91ce
> >> _files/melissa/Documents/EPHESUS/Office Stuff/SPD/SPD SUMMER 2016 (1).pdf
> >> 3 b396af67f2cd75658397efd878a01fb8
> >> _files/dads_zipdisks/2003-1/CLASS at VBC Sp-03/CLASS BKUP - Music Reading & Sight Singing Class/C & D Major & Minor Scales & Chords.mct
> >> 2 cd83094e0c4aeb9128806b5168444578
> >> _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160318-222051.png
> >> 2 d1a5a1bec046cc85a3a3fd53a8d5be86
> >> _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160410-145331.png
> >> 2 fa681c54a2bd7cfa590ddb8cf6ca1cea
> >> _files/from_ebay_pc/pics_and_such_from_work/phone_backup/try2_nonptp_or_whatever/Pictures/Screenshots/Screenshot_20160312-113340.png
> >>
> >> Originally the _files directory had MANY duplicates; now I've managed
> >> to get that down to the above list...
> >>
> >> Anyway, there you go. Happy scripting.
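
Putting the corrected pieces of the thread together, the whole sequence looks
roughly like the sketch below. It is only a sketch: the output file name
dupes.md5 and the starting directory ~ are placeholders, and -xdev is the GNU
find option for "stay on one filesystem" that Rusty alludes to above.

  # Build a sorted list of "md5sum  path" lines for every regular file
  # under the starting directory (start at / and run as root to cover
  # the whole machine; -xdev keeps find from descending into other
  # mounted filesystems such as /proc, /sys, or /dev).
  cd ~
  find . -xdev -type f -print0 | xargs -0 md5sum | sort > dupes.md5

  # Print one line per duplicated checksum, with a count of how many
  # files share it. -w 34 compares only the first 34 characters: the
  # 32-character md5 plus the two separator characters, so the
  # differing file names are ignored.
  uniq -c -d -w 34 dupes.md5

  # Or print every file whose checksum appears more than once, so the
  # extra copies can be reviewed and removed by hand.
  uniq -D -w 34 dupes.md5

Note that -xdev also skips anything on a separately mounted partition (a
separate /home, external drives, network mounts), so leave it off if those
locations should be searched too.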