OK, a REAL Linux question... ;-) I need a "one-liner" (because I am lazy)

Bob Elzer bob.elzer at gmail.com
Fri Feb 27 10:27:16 MST 2009


Well, from your first post it sounded like this was a one-time deal.

>>I want to find the file that resembles most closely the one I have at hand.

"Most closely" implies you don't know what it looks like.

"One-liner" means just that.

Is this an ongoing process? Do you have a starting file to work with? Your
first message said you needed to find two nearly identical files somewhere
in a directory tree.

I'm not picking on you, but I used to tell my users: if you ask Santa for a
toy soldier and you get a green plastic army man, don't be upset that you
really wanted a G.I. Joe doll.

Is the change always on the same line of the file?

Can you give us a real example of one of these?

 

-----Original Message-----
From: plug-discuss-bounces at lists.plug.phoenix.az.us
[mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of
kitepilot at kitepilot.com
Sent: Friday, February 27, 2009 7:22 AM
To: Main PLUG discussion list
Subject: Re: OK, a REAL Linux question... ;-) I need a "one-liner" (because I
am lazy)

Thanks.
Time won't work.
These files are coming from a repository and they all have the same
date/time. 

>> There is no command to find something, if you don't know what you 
>> want to find.
I know what I want to find.
I want to find the file that resembles most closely the one I have at hand.
It's called a "Fuzzy" search. 

One approach would be to fire up a loop that compares every file to another
one, ignoring whitespace, logs the resulting diffs, picks the smallest
results at the end of the run (after you define "smallest") and then uses
some sort of "fuzzy algorithm" to pick the finalists.
The final decision is made by hand.
Far from a "one-liner"...   :)
Thanks!   :)
ET 
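
The loop described above could be sketched in Python rather than as a shell one-liner. This is only a rough sketch under stated assumptions: the function names normalized_lines and rank_by_similarity are made up for illustration, the similarity ratio from Python's standard difflib module stands in for "smallest diff", and whitespace differences are ignored by collapsing runs of whitespace on each line and dropping blank lines. It ranks candidates against the one file at hand and leaves the final pick to a human.

```python
import difflib
import os


def normalized_lines(path):
    """Return the non-blank lines of a text file with whitespace collapsed."""
    with open(path, "r", errors="replace") as handle:
        return [" ".join(line.split()) for line in handle if line.strip()]


def rank_by_similarity(reference, root, top=5):
    """Return up to `top` (ratio, path) pairs under `root`, most similar first.

    `reference` is the file at hand; `root` is the top of the tree to search.
    A ratio of 1.0 means identical after whitespace normalization.
    """
    ref_lines = normalized_lines(reference)
    ref_real = os.path.realpath(reference)
    scored = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            candidate = os.path.join(dirpath, name)
            if os.path.realpath(candidate) == ref_real:
                continue  # do not compare the reference with itself
            matcher = difflib.SequenceMatcher(
                None, ref_lines, normalized_lines(candidate), autojunk=False
            )
            scored.append((matcher.ratio(), candidate))
    # Highest ratio first; these are the "finalists" to inspect by hand.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]
```

Calling rank_by_similarity("Makefile", ".") and printing each (ratio, path) pair would give a short list to eyeball. With several hundred files this line-by-line comparison should be fast enough, but that is a guess and not something measured here.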

 


Bob Elzer writes: 

> ls -aCltR
> 
> will list all the files in the current directory and below. 
> 
> each directory will be listed sorted by the date files were modified, 
> most recent first.
> 
> There are flags for the time format, but the most recent changes will 
> be at the top of each directory.
> 
> You will have to do some work yourself, but this should narrow it down. 
> 
> There is no command to find something, if you don't know what you want 
> to find.
> 
> Although, the find command, can find files modified at certain times, 
> if you know about when the file changed.
> 
> 
> -----Original Message-----
> From: plug-discuss-bounces at lists.plug.phoenix.az.us
> [mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of 
> kitepilot at kitepilot.com
> Sent: Thursday, February 26, 2009 4:58 PM
> To: Main PLUG discussion list
> Subject: Re: OK, a REAL Linux question... ;-) I need a "one-liner" 
> (because I am lazy)
> 
>>> *diff | wc -l* for each combination of file?
>>> have you tried ls -t, to see when the files were modified ?
> There are several hundreds of files in a 10-15 depth tree.
> That means that "ls -t" won't work, and firing a loop to diff each 
> one, to every other, will yield so many false positives that the 
> result (if found) will be lost in the noise.
> 
> It has to be some sort of "fuzzy" diff.
> I used to use a program called Uniquefiler that did that for pictures.  
> Sometimes it would come up with some very creative matching, but in 
> general it was an excellent program.
> I don't need it now, but I'd certainly like to know if someone knows 
> of a Linux variant.
> Thanks!   :)
> ET
> 
> 
> Eric Cope writes:  
> 
>> *diff | wc -l* for each combination of file?  
>> 
>> On Thu, Feb 26, 2009 at 3:12 PM, Bob Elzer <bob.elzer at gmail.com> wrote:  
>> 
>>> No you don't qualify, this is the Phoenix List.   Just kidding.  
>>>
>>> have you tried ls -t, to see when the files were modified ?  
>>>
>>>
>>> -----Original Message-----
>>> From: plug-discuss-bounces at lists.plug.phoenix.az.us
>>> [mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of 
>>> kitepilot at kitepilot.com
>>> Sent: Thursday, February 26, 2009 2:25 PM
>>> To: Main PLUG discussion list
>>> Subject: OK, a REAL Linux question... ;-) I need a "one-liner" 
>>> (because I am lazy)
>>>
>>> I have a bunch of text files.
>>> Makefile(s), that is.  
>>>
>>> I know that one of them (THERE ARE TONS!) was slightly modified.
>>> Names are meaningless, so it won't work.
>>> There are more changes than mere whitespace, so diff -w ... won't 
>>> work either.
>>>
>>> Question is:
>>> How do I find 2 files that are "almost" the same file?  
>>>
>>> I have thought of different approaches, but none of them are one-liners.
>>> Is there a one-liner for this?
>>> Thanks!
>>> Enrique
>>>
>>> PS: I live in North West GA, play the worker in South Florida, drive 
>>> like a mailman and consider "the neighborhood" anything within 200
>>> miles.
>>> Do I qualify as member of this list?   ;-)
>>> ---------------------------------------------------
>>> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
>>> To subscribe, unsubscribe, or to change your mail settings:
>>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss


