regexp problem

David A. Sinck plug-discuss@lists.plug.phoenix.az.us
Wed, 11 Dec 2002 12:57:06 -0700


\_ SMTP quoth Lynn David Newton on 12/11/2002 12:30 as having spake thusly:
\_
\_ 
\_ Greetings,
\_ 
\_ I need a (Perl) regular expression that lops the first
\_ word off a string which could possibly (does, in fact)
\_ include newlines.
\_ 
\_ In the problem in question, the string data should only
\_ contain digits. What is happening is that something is
\_ causing a string to look like this: "234\n234", a
\_ duplicate of its original self, with a newline
\_ inserted, and because it's necessarily a string for
\_ various operational reasons, that's how it's being
\_ written to a database.
\_ 
\_ I found this solution myself, a two-step operation.
\_ Given that $id="234\n234":
\_ 
\_ $id =~ s/\D/ /g;
\_ $id =~ s/\D.*$//g;
\_ 
\_ Works (I think).
\_ 
\_ The first line changes non-digit characters to spaces,
\_ giving me an ordinary space-delimited string.
\_ 
\_ The second line swacks off everything at the end
\_ starting from the first non-digit character.
\_ 
\_ In the other solutions I tried, I couldn't deal with
\_ the newline.
\_ 
\_ Maybe someone can see a simpler solution. 
\_ 
\_ Extra credit with the usual PLUG prize goes to someone
\_ who can generalize it so that it works with a string of
\_ any type, i.e., to produce just the first "word" of the
\_ original string, i.e., everything up to the first space
\_ or newline, with the rest discarded.

my ($wordy_thing) = ($string =~ /^(\S+)/);  # ?

In particular, if you want all the digits up to the first non-digit,
you could use /^(\d+)/.
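
A quick way to check both patterns against the string from the question. This is only a sketch: the variable names and the second sample string are made up for illustration.

```perl
use strict;
use warnings;

# Sample value from the question: a duplicated ID with an embedded newline.
my $id = "234\n234";

# A match in list context returns the captured group (or an empty list
# if the pattern fails), so the parentheses around the "my" matter.
my ($first_word)   = $id =~ /^(\S+)/;   # everything up to the first whitespace
my ($first_digits) = $id =~ /^(\d+)/;   # leading run of digits only

print "$first_word\n";     # prints 234
print "$first_digits\n";   # prints 234

# Hypothetical non-numeric input, to show the general "first word" case.
my ($word) = "hello\nworld again" =~ /^(\S+)/;
print "$word\n";           # prints hello
```

Note that if the string starts with whitespace, /^(\S+)/ does not match and the variable is left undefined, so real code would probably want to test for that.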

I think the RE symbol you missed was the ^ anchor.
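
If you would rather edit the string in place than capture, the other thing to watch is that "." does not match a newline unless you add the /s modifier. A small sketch, using a made-up sample with more than one newline to show the difference:

```perl
use strict;
use warnings;

my $id = "234\n234\n567";   # made-up sample, not from the original post

# Without /s, "." stops at a newline, so the match can only start at the
# last newline and only the final line is trimmed.
(my $no_s = $id) =~ s/\D.*$//;

# With /s, "." also matches newlines, so everything from the first
# non-digit onward is removed in one step.
(my $with_s = $id) =~ s/\D.*//s;

print length($no_s), "\n";   # prints 7, i.e. "234\n234" is left
print "$with_s\n";           # prints 234
```

For the general "first word" case, swapping \D for \s should behave the same way.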

YMMV.

David