regexp problem
David A. Sinck
plug-discuss@lists.plug.phoenix.az.us
Wed, 11 Dec 2002 12:57:06 -0700
\_ SMTP quoth Lynn David Newton on 12/11/2002 12:30 as having spake thusly:
\_
\_
\_ Greetings,
\_
\_ I need a (Perl) regular expression that lops the first
\_ word off a string which could possibly (does, in fact)
\_ include newlines.
\_
\_ In the problem in question, the string data should only
\_ contain digits. What is happening is that something is
\_ causing a string to look like this: "234\n234", a
\_ duplicate of its original self, with a newline
\_ inserted, and because it's necessarily a string for
\_ various operational reasons, that's how it's being
\_ written to a database.
\_
\_ I found this solution myself, a two-step operation.
\_ Given that $id="234\n234":
\_
\_ $id =~ s/\D/ /g;
\_ $id =~ s/\D.*$//g;
\_
\_ Works (I think).
\_
\_ The first line changes non-digit characters to spaces,
\_ giving me an ordinary space-delimited string.
\_
\_ The second line swacks off everything at the end
\_ starting from the first non-digit character.
\_
\_ In the other solutions I tried, I couldn't deal with
\_ the newline.
\_
\_ Maybe someone can see a simpler solution.
\_
\_ Extra credit with the usual PLUG prize goes to someone
\_ who can generalize it so that it works with a string of
\_ any type, i.e., to produce just the first "word" of the
\_ original string, i.e., everything up to the first space
\_ or newline, with the rest discarded.
my ($wordy_thing) = ($string =~ /^(\S+)/); # ?
In particular, if you're looking for all digits up to the first
non-digit, you could use /^(\d+)/.
I think the RE symbol you missed was the ^ anchor. Without /m it
matches only at the very start of the string, so the newline never
comes into play.
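
A minimal sketch of both patterns, assuming the sample value from the
question. The variable names other than $id are only for illustration,
and the in-place s///s variant is an extra not in the original reply:

```perl
use strict;
use warnings;

# Sample value taken from the question: a duplicated id with a newline.
my $id = "234\n234";

# A match in list context returns its captures. ^ anchors at the start
# of the string (no /m), so only the leading run is captured.
my ($first_word)   = ($id =~ /^(\S+)/);   # everything up to first whitespace
my ($first_digits) = ($id =~ /^(\d+)/);   # leading digits only

print "$first_word\n";     # prints 234
print "$first_digits\n";   # prints 234

# In-place variant: /s lets . match the newline, so .* swallows the rest.
(my $trimmed = $id) =~ s/^(\S+).*/$1/s;
print "$trimmed\n";        # prints 234
```

If the string starts with whitespace, /^(\S+)/ does not match and the
variable is left undefined; /^\s*(\S+)/ would skip the leading
whitespace first.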
YMMV.
David