Got a text formatting/database question ("bash" it to hell?)
Jim March
1.jim.march at gmail.com
Tue Apr 14 18:34:30 MST 2009
Guys,
I have an interesting database problem that I think can be solved on
the command line in one shot. But I don't know how :(.
I have a comma-separated values text file. Each line shows a voter ID
number and the ID number of an election they voted in. NOT who they
voted for, and not their names, just that they voted in that election
(cast a ballot at all, even if blank).
A given voter has likely voted in multiple elections. So here's the
section for two voter IDs (first column) and the elections they voted
in (second column), plus the method used to vote (third column) if it
was early or mail-in (which I can ignore). In pasting it into email
(from the OpenOffice spreadsheet I used as a quick viewer) the columns
came out separated by spaces, but in the original data they're
separated by commas.
---
233 2
233 3
233 4
233 5
233 6
233 7
233 31
233 32
233 38
233 41
233 45
233 55
233 57
233 95
233 96
235 2
235 3
235 4
235 5
235 6
235 7
235 31 Early Ballot
235 32 Early Ballot
235 38
235 45
235 55
235 57 Early Ballot
235 95 Early Ballot
235 96 Early Ballot
235 125
235 126 Early Ballot
235 143
235 147 Early Ballot
235 148 Early Ballot
235 170 Early Ballot
---
So what I want to do is, strip out every line that does NOT have a
"170" in the second column, and then produce a line count. I need to
know (like ASAP) how many people voted in election 170 as that's the
2006 RTA special election in Pima County now subject to a recount.
Then I can do a second pass using the same technique to find out how
many of those people filed an early ballot: strip out the early-ballot
lines, count the remaining lines, and do basic subtraction.
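
A minimal sketch of the kind of one-shot filter-and-count described
above, using awk. It assumes the fields are unquoted and separated by
plain commas, and that the third field contains the text "Early Ballot"
when present. The sample rows for voters 236 and 237 and the use of a
temporary file are made up for illustration; the real file's name and
exact layout are not given here.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real export.
# Only the "235,..." rows come from the excerpt above; the rest are invented.
sample=$(mktemp)
cat > "$sample" <<'EOF'
233,96
235,148,Early Ballot
235,170,Early Ballot
236,170
237,1170
EOF

# Count lines whose second comma-separated field is exactly 170.
# Comparing the field (not grepping the line) avoids matching 1170
# or a voter ID that happens to contain 170.
total=$(awk -F, '$2 == 170 { n++ } END { print n + 0 }' "$sample")

# Same filter, restricted to lines marked as early ballots.
early=$(awk -F, '$2 == 170 && $3 ~ /Early Ballot/ { n++ } END { print n + 0 }' "$sample")

echo "voted in election 170: $total"
echo "of those, early ballots: $early"
echo "of those, not early: $((total - early))"

rm -f "$sample"
```

To run it against real data, the two awk commands would be pointed at
the actual CSV file instead of the temporary sample. If the fields turn
out to be quoted, the comparisons would need adjusting.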
Help? This is about a criminal investigation going on right now
regarding this election...
Thanks!
Jim March