Got a text formatting/database question ("bash" it to hell?)

Eric Shubert ejs at shubes.net
Tue Apr 14 20:02:37 MST 2009


Jim March wrote:
> Guys,
> 
> I have an interesting database problem that I think can be solved on
> the command line in one shot.  But I don't know how :(.
> 
> I have a comma separated values text file.  Each line shows a voter ID
> number and an election ID number they voted in.  NOT who they voted
> for, and not their names, just that they voted in that election (cast
> a ballot at all, even if blank).
> 
> There are multiple elections a given voter likely voted in.  So
> here's the section for two voter IDs (first column) and the elections
> they voted in (second column) plus the method used to vote (third
> column) if it was early or mail-in (which I can ignore).  In pasting
> it to EMail (from Openoffice spreadsheet used as a quick viewer)
> they're separated by spaces but in the original data it's commas.
> 
> ---
> 233	2	
> 233	3	
> 233	4	
> 233	5	
> 233	6	
> 233	7	
> 233	31	
> 233	32	
> 233	38	
> 233	41	
> 233	45	
> 233	55	
> 233	57	
> 233	95	
> 233	96	
> 235	2	
> 235	3	
> 235	4	
> 235	5	
> 235	6	
> 235	7	
> 235	31	Early Ballot
> 235	32	Early Ballot
> 235	38	
> 235	45	
> 235	55	
> 235	57	Early Ballot
> 235	95	Early Ballot
> 235	96	Early Ballot
> 235	125	
> 235	126	Early Ballot
> 235	143	
> 235	147	Early Ballot
> 235	148	Early Ballot
> 235	170	Early Ballot
> ---
> 
> So what I want to do is, strip out every line that does NOT have a
> "170" in the second column, and then produce a line count.  I need to
> know (like ASAP) how many people voted in election 170 as that's the
> 2006 RTA special election in Pima County now subject to a recount.
> And then I can do a second pass using the same technique and find out
> how many people filed an early ballot by stripping out those and
> counting lines again (and doing basic subtraction).
> 
> Help?  This is about a criminal investigation going on right now
> regarding this election...
> 
> Thanks!
> 
> Jim March

$ grep ",170," input.txt | tee hits.txt | wc -l
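
If some lines leave the third column off entirely (no trailing comma after
the election ID), a pattern match on ",170," would miss them. A minimal
sketch with awk that tests the second field directly and does the early
ballot subtraction in the same pass. The sample input.txt contents and the
count_election function name are made up for illustration, and I'm assuming
the early/mail-in rows all contain the word "Early" in the third column:

```shell
# Made-up sample in the same layout as the question:
# voter ID, election ID, voting method (may be empty or missing).
cat > input.txt <<'EOF'
233,96,
235,170,Early Ballot
236,170,
237,1170,Early Ballot
238,170
EOF

# count_election ELECTION_ID FILE
# Compares field 2 with the given ID (so 1170 does not count as 170),
# and tallies how many of those rows mention "Early" in field 3.
count_election() {
    awk -F, -v id="$1" '
        $2 == id { total++; if ($3 ~ /Early/) early++ }
        END { printf "total=%d early=%d at_polls=%d\n", total, early, total - early }
    ' "$2"
}

count_election 170 input.txt
```

On the sample above this prints total=3 early=1 at_polls=2. If the real
file labels mail-in ballots differently, the /Early/ test would need to be
adjusted to match.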

-- 
-Eric 'shubes'


