Re: Python help (finding duplicates)

Author: Joseph Sinclair
Date:  
To: kondor6c, Main PLUG discussion list
Subject: Re: Python help (finding duplicates)
I hope these are small files; the nested-loop algorithm you wrote is not going to run well as the file size gets large (over 10,000 entries).
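For larger inputs, one common alternative is to load the entries of one file into a set and make a single pass over the other, so each lookup does not require re-reading a whole file. The following is only a minimal sketch under some assumptions: entries are compared as stripped strings (not as numbers), the '#' is placed in front of the duplicate, and the function name mark_duplicates is hypothetical. It works on lists of lines rather than on the files themselves, using the example values from the original message.

```python
def mark_duplicates(primary_lines, secondary_lines):
    """Return the secondary entries, with '#' prefixed to any also found in primary.

    Assumption: entries match only if they are identical after stripping
    surrounding whitespace; no numeric comparison is attempted.
    """
    # Build the lookup set once; membership tests are O(1) on average.
    seen = {line.strip() for line in primary_lines if line.strip()}
    marked = []
    for line in secondary_lines:
        entry = line.strip()
        marked.append('#' + entry if entry in seen else entry)
    return marked


primary = ['433.3', '543.1', '741.1', '238.5', '839.2', '583.6', '863.4', '859.2']
secondary = ['947.3', '749.0', '859.2', '433.3', '229.1', '990.1', '741.1', '101.8']
for entry in mark_duplicates(primary, secondary):
    print(entry)
```

With the sample data this marks 859.2, 433.3 and 741.1 in the second list. Whether the result should be written back to the same file or to a new one depends on the answers to the questions below.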
Have you checked the space/tab situation? Python uses indentation changes to indicate the end of a block, so inconsistent use of tabs and spaces freaks it out.
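One rough way to check is to scan the script and report which lines are indented with tabs and which with spaces; if both lists come back non-empty, the file mixes the two. This is just an illustrative sketch: the function name indentation_report is hypothetical, and the sample lines are made up to show the idea.

```python
def indentation_report(source_lines):
    """Return (tab_lines, space_lines): 1-based line numbers whose leading
    whitespace contains a tab or a space, respectively."""
    tab_lines, space_lines = [], []
    for number, line in enumerate(source_lines, start=1):
        stripped = line.lstrip(' \t')
        if not stripped.strip():
            continue  # ignore blank or whitespace-only lines
        indent = line[:len(line) - len(stripped)]
        if '\t' in indent:
            tab_lines.append(number)
        if ' ' in indent:
            space_lines.append(number)
    return tab_lines, space_lines


# Made-up sample: line 2 is indented with spaces, line 3 with a tab.
sample = ["for line in primaryfile:\n", "   pcompare = line\n", "\tprint(pcompare)\n"]
tabs, spaces = indentation_report(sample)
if tabs and spaces:
    print("mixed indentation: tabs on lines", tabs, "and spaces on lines", spaces)
```

The standard library's tabnanny module (python -m tabnanny yourscript.py) does a more thorough version of this check.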
Here are a couple questions:
Are these always numbers?
Do the files have to remain in their original order, or can you reorder them during processing?
How often does this have to run?
Do you have to "comment" the duplicate, or can you remove it?
Are there any other requirements not obvious from the description below?

Kevin Faulkner wrote:
> I was trying to pull duplicates out of 2 different files. Needless to say, there 
> are duplicates; I would place a # next to each duplicate. Example files:
> file 1:    file 2:
> 433.3    947.3
> 543.1    749.0
> 741.1    859.2
> 238.5    433.3
> 839.2    229.1
> 583.6    990.1
> 863.4    741.1
> 859.2    101.8

>
> import string
> i=1
> primaryfile = open('/tmp/extract','r')
> secondaryfile = open('/tmp/unload')
> for line in primaryfile:
>    pcompare = line
>    print(pcompare)
>    for row in secondaryfile:
>      i = i + 1
>      print(i)
>      scompare = row
>      if pcompare == scompare:
>        print(scompare)
>        secondaryfile.write('#')
> With this code it should go through the files and find a duplicate and place a 
> '#' next to it. But for some reason it doesn't even get to the second for 
> statement. I don't know what else to do. Please offer some assistance. :)
> ---------------------------------------------------
> PLUG-discuss mailing list - 
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
