OK, a REAL Linux question... ;-) I need a "one-liner" (because I am lazy)

kitepilot at kitepilot.com
Fri Feb 27 12:37:56 MST 2009


>> Well it sounds like from your first post this was a one time deal.
Right now it is, but I could probably find a use for it in the future... 

>> "most closely" implies, I don't know what it looks like.
Which is correct.
I know what I want.
I don't know what it looks like.
I know something that looks pretty similar though.
A dotted army soldier?  :) 


For example, say that you have a bunch of make files.
Somebody added one include path to one of them.
I have 2 directory trees, names are meaningless (for the most part), and
dates are all the same because someone ran 'touch *'

I know that many files had many changes, but the two I'm looking for, had 
minor changes between them. 

The question is, which files most closely resemble each other?
It's a very fuzzy question, but valid, and to put it into your context, it
would be along the lines of:
In this box of soldiers that the dog chewed up, which 2 of them are missing
half an arm on opposite sides?
Clear as mud?
Didn't expect any better...   ;-)
Thanks!
Enrique 
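
PS: One way to make the "which files most closely resemble each other"
question concrete is to score every file in one tree against every file in
the other and only look at the top of the list. What follows is a rough
sketch under my own assumptions, not a tested tool: it uses Python's
difflib.SequenceMatcher as the similarity measure, collapses whitespace
before comparing, and skips pairs that come out identical. The names
read_lines, list_files and rank_pairs are made up for the example.

```python
import difflib
import itertools
import os
import sys


def read_lines(path):
    """Return the file's lines with whitespace runs collapsed, so that
    pure whitespace changes do not count as differences."""
    with open(path, errors="replace") as handle:
        return [" ".join(line.split()) for line in handle]


def list_files(root):
    """Yield every regular file below root, however deep the tree is."""
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            yield os.path.join(directory, name)


def rank_pairs(tree_a, tree_b, top=10):
    """Score every file in tree_a against every file in tree_b.

    Returns the `top` most similar pairs as (ratio, path_a, path_b),
    best first. Pairs with ratio 1.0 are skipped, since the files of
    interest differ slightly rather than not at all.
    """
    files_a = {p: read_lines(p) for p in list_files(tree_a)}
    files_b = {p: read_lines(p) for p in list_files(tree_b)}
    scored = []
    for (path_a, lines_a), (path_b, lines_b) in itertools.product(
            files_a.items(), files_b.items()):
        # ratio() is 2*M/T: matched lines over total lines in both files.
        ratio = difflib.SequenceMatcher(None, lines_a, lines_b).ratio()
        if ratio < 1.0:
            scored.append((ratio, path_a, path_b))
    scored.sort(reverse=True)
    return scored[:top]


if __name__ == "__main__" and len(sys.argv) == 3:
    for ratio, path_a, path_b in rank_pairs(sys.argv[1], sys.argv[2]):
        print(f"{ratio:.3f}  {path_a}  {path_b}")
```

Saved as, say, closest.py (a hypothetical name), it would be run as
"python closest.py tree1 tree2". With several hundred files per tree this
is a lot of comparisons, so it may be slow, and the final pick is still
made by hand from the short list.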


Bob Elzer writes: 

> Well it sounds like from your first post this was a one time deal. 
> 
>>> I want to find the file that resembles most closely the one I have at hand.
> 
> "most closely" implies, I don't know what it looks like. 
> 
> One Liner means just that. 
> 
> Is this an on going process, do you have a starting file to work with ? Your
> first message said you needed to find two near similar files somewhere in a
> directory tree. 
> 
> I'm not picking on you, but I used to tell my users, If you ask Santa for a
> toy soldier and you get a green plastic army man, don't be upset because you
> really wanted a G.I. Joe doll. 
> 
> Is what changes in the file the same line each time ? 
> 
> Can you give us a real example of one of these ? 
> 
>   
> 
> -----Original Message-----
> From: plug-discuss-bounces at lists.plug.phoenix.az.us
> [mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of
> kitepilot at kitepilot.com
> Sent: Friday, February 27, 2009 7:22 AM
> To: Main PLUG discussion list
> Subject: Re: OK, a REAL Linux question... ;-) I need a "one-liner" (because I
> am lazy)
> 
> Thanks.
> Time won't work.
> These files are coming from a repository and they all have the same
> date/time.  
> 
>>> There is no command to find something, if you don't know what you 
>>> want to find.
> I know what I want to find.
> I want to find the file that resembles most closely the one I have at hand.
> It's called a "Fuzzy" search.  
> 
> One approach would be to fire up a loop to compare every file to another one
> ignoring white-spaces, log the resulting diff files, choose the smallest
> results at the end of the run (after you define "smallest") and then use
> some sort of "Fuzzy algorithm" to pick the finalists.
> The final decision is hand picked.
> Far from a "one-liner"...   :)
> Thanks!   :)
> ET  
> 
>   
> 
> 
> Bob Elzer writes:  
> 
>> ls -aCltR 
>> 
>> will list all the files in the current directory and below.  
>> 
>> each directory will be listed sorted by the date files were modified, 
>> most recent first. 
>> 
>> There are flags for the time format, but the most recent changes will 
>> be at the top of each directory. 
>> 
>> You will have to do some work yourself, but this should narrow it down.
>> 
>> There is no command to find something, if you don't know what you want 
>> to find. 
>> 
>> Although, the find command, can find files modified at certain times, 
>> if you know about when the file changed. 
>> 
>> 
>> -----Original Message-----
>> From: plug-discuss-bounces at lists.plug.phoenix.az.us
>> [mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of 
>> kitepilot at kitepilot.com
>> Sent: Thursday, February 26, 2009 4:58 PM
>> To: Main PLUG discussion list
>> Subject: Re: OK, a REAL Linux question... ;-) I need a "one-liner"
>> (because I am lazy) 
>> 
>>>> *diff | wc -l* for each combination of file?
>>>> have you tried ls -t, to see when the files were modified ?
>> There are several hundreds of files in a 10-15 depth tree.
>> That means that "ls -t" won't work, and firing a loop to diff each 
>> one, to every other, will yield so many false positives that the 
>> result (if found) will be lost in the noise. 
>> 
>> It has to be some sort of "fuzzy" diff.
>> I used to use a program called Uniquefiler that did that for pictures.  
>> Sometimes it would come up with some very creative matching, but in 
>> general it was an excellent program.
>> I don't need it now, but I'd certainly like to know if someone knows
>> of a Linux variant.
>> Thanks!   :)
>> ET 
>> 
>> 
>> Eric Cope writes:   
>> 
>>> *diff | wc -l* for each combination of file?   
>>> 
>>> On Thu, Feb 26, 2009 at 3:12 PM, Bob Elzer <bob.elzer at gmail.com> wrote:   
>>> 
>>>> No you don't qualify, this is the Phoenix List.   Just kidding.   
>>>>
>>>> have you tried ls -t, to see when the files were modified ?   
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: plug-discuss-bounces at lists.plug.phoenix.az.us
>>>> [mailto:plug-discuss-bounces at lists.plug.phoenix.az.us] On Behalf Of 
>>>> kitepilot at kitepilot.com
>>>> Sent: Thursday, February 26, 2009 2:25 PM
>>>> To: Main PLUG discussion list
>>>> Subject: OK, a REAL Linux question... ;-) I need a "one-liner"
>>>> (because I am
>>>> lazy) 
>>>>
>>>> I have a bunch of text files.
>>>> Makefile(s), that is.   
>>>>
>>>> I know that one of them (THERE ARE TONS!) was slightly modified.
>>>> Names are meaningless, so it won't work.
>>>> There are more changes than mere whitespace, so diff -w ... won't
>>>> work either. 
>>>>
>>>> Question is:
>>>> How do I find 2 files that are "almost" the same file?   
>>>>
>>>> I have thought of different approaches, but none of them are one-liners.
>>>> Is there a one-liner for this?
>>>> Thanks!
>>>> Enrique 
>>>>
>>>> PS: I live in North West GA, play the worker in South Florida, drive 
>>>> like a mailman and consider "the neighborhood" anything within 200 miles.
>>>> Do I qualify as member of this list?   ;-)
>>>> ---------------------------------------------------
>>>> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
>>>> To subscribe, unsubscribe, or to change your mail settings:
>>>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss 