Help with Regular Expression

David A. Sinck plug-discuss@lists.plug.phoenix.az.us
Tue, 3 Dec 2002 11:21:03 -0700


\_ SMTP quoth az_pete@cactusfamily.com on 12/3/2002 11:04 as having spake thusly:
\_
\_ Hi All, 
\_ 
\_ I seem to be having a lot of trouble with what seems like it should
\_ be a simple regex.
\_ 
\_ I have a database full of research paper abstracts and I would like
\_ to strip all newlines from them. This would include \n, \r, and
\_ \r\n sequences.  However, if there are two consecutive newlines
\_ (i.e. a new paragraph) I would like to keep those intact.
\_ 
\_ I have written the script in PHP to pull each field from the
\_ database, perform said regex and then update the field with the new
\_ data.  All I need is a regex that works.  I'm using the Perl
\_ compatible regex within PHP.
\_ 
\_ Any help would be appreciated.

I'd do two passes for ease of thought:

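s/\r\n?/\n/g;  # turn \r\n and lone \r into \n, so only \n is left

s/(?<!\n)\n(?!\n)/ /g;  # a newline with no newline on either side goes to space

(The lookarounds matter: a plain [^\n]\n[^\n] would also eat the
character on each side of the newline.)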
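
For anyone who wants to try the two passes before pointing them at the
database, here is a minimal sketch in Python. The function name and the
sample string are made up for illustration; the patterns are ordinary
Perl-style lookarounds, so they should carry over to PHP's preg_replace,
but test that on a copy of your data first.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines but keep blank-line paragraph breaks."""
    # Pass 1: normalize every line ending (\r\n or lone \r) to \n.
    text = re.sub(r"\r\n?", "\n", text)
    # Pass 2: a newline with no newline on either side becomes a space.
    # The lookarounds consume nothing, so neighbouring letters survive.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


# Hypothetical sample mixing all three line-ending styles.
sample = "First line\r\nsecond line\rthird line\n\nNew paragraph\nwraps here"
print(repr(unwrap(sample)))
# 'First line second line third line\n\nNew paragraph wraps here'
```

Note that a single newline at the very start or end of the text also
turns into a space, so trim afterwards if that bothers you.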

YMMV.

Trying to do both in one could prove more amusing and is left as an
exercise for the reader.
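
For the impatient, one possible single-pass version, again only a sketch
in Python with made-up names: match each whole run of line endings and
let a callback decide what the run becomes. In PHP the same idea would go
through preg_replace_callback; I have not tried it there.

```python
import re

# One or more line endings in a row, in any of the three styles.
NEWLINE_RUN = re.compile(r"(?:\r\n|\r|\n)+")


def unwrap_one_pass(text: str) -> str:
    """Single substitution: lone line breaks become spaces, longer runs stay."""

    def replace_run(match: re.Match) -> str:
        # Count the logical line breaks in this run (\r\n counts once).
        breaks = len(re.findall(r"\r\n|\r|\n", match.group()))
        # One break is just a wrapped line; two or more is a paragraph
        # break, kept with its endings normalized to \n.
        return " " if breaks == 1 else "\n" * breaks

    return NEWLINE_RUN.sub(replace_run, text)


print(repr(unwrap_one_pass("one\r\ntwo\r\n\r\nthree\rfour")))
# 'one two\n\nthree four'
```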

Backups are your friend.

David