perl regex
Trent Shipley
plug-discuss@lists.plug.phoenix.az.us
Wed, 26 Mar 2003 13:33:55 -0700
- Assume the ':' character *never* occurs in data.
- Assume there are no quote or escape sequences.
- A line is delimited by Perl's magic "\n".
- 'FIELD' is not really a constant but a non-empty string of printable,
non-blank characters, none of which is ':'.
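my $field_title = qr/[^\s:]+:/;
# legal: one or more characters that are neither whitespace nor ':', then ':'
# same as
$field_title = qr/[^:[:space:]]+:/;
# definitely legal; [:space:] is built into Perl's regex engine,
# so no "use POSIX;" is needed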
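A quick sanity check of the two spellings, as a minimal sketch. The sample strings are made up for illustration; only 'FIELD:' and 'TYPE:' resemble anything in the original question.

```perl
use strict;
use warnings;

my $field_title = qr/[^\s:]+:/;
my $posix_title = qr/[^:[:space:]]+:/;

# Hypothetical samples: two that should pass, three that should not.
my @results;
for my $sample ('FIELD:', 'TYPE:', 'two words:', ':', 'no-colon') {
    # Anchor both ends so the whole sample has to be a field title.
    my $short = $sample =~ /\A$field_title\z/ ? 1 : 0;
    my $posix = $sample =~ /\A$posix_title\z/ ? 1 : 0;
    push @results, [ $sample, $short, $posix ];
    printf "%-12s %d %d\n", $sample, $short, $posix;
}
```

Both columns should agree on every row, which is the point of "same as".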
I see three cases for separators; one is implicit.
my $first_separator = qr/\A$field_title/;
my $middle_separator = qr/, $field_title/;
#my $end_separator = qr/\Z/; # the implicit "\n".
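The middle separator also hints at a way to rescue the original 'split' idea: split only on commas that are followed by a field title, using a lookahead so the title itself is not consumed. This is a sketch under the assumptions listed at the top; the sample line and the names $line, @pairs and @records are invented for illustration.

```perl
use strict;
use warnings;

my $field_title = qr/[^\s:]+:/;

# Hypothetical sample line in the shape described in the question.
my $line = "TYPE: a, NAME: b, c, SIZE: d\n";
chomp(my $copy = $line);

# Split on ", " only when a field title follows; commas inside data stay put.
my @pairs = split /,\s+(?=$field_title)/, $copy;

# Break each "TITLE: data" chunk at its first colon.
my @records = map { [ split /:\s*/, $_, 2 ] } @pairs;

print "$_->[0] => $_->[1]\n" for @records;
```

The comma inside "b, c" survives because "c," is not followed by a colon, so the lookahead fails there.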
Data sits between separators, in one of three forms.
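# qr// compiles the pattern; a bare m// would match against $_ right away.
# The parentheses capture the datum; *? lets it span more than one character.
my $first_datum = qr/$first_separator
([^:]*?)
$middle_separator
/x;
my $middle_datum = qr/$middle_separator
([^:]*?)
$middle_separator
/x;
my $last_datum = qr/$middle_separator
([^:]*?)
\Z
/x;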
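To see what each of the three forms picks up, here is a small sketch run against a made-up line. It assumes each form is compiled with qr// and carries one capture group, and it adds a \s* after the title (my addition) so the blank following the colon is not captured.

```perl
use strict;
use warnings;

my $field_title      = qr/[^\s:]+:/;
my $first_separator  = qr/\A$field_title/;
my $middle_separator = qr/, $field_title/;

# Interpolated qr// objects keep their own flags, so the literal blank
# inside $middle_separator is still significant under /x.
my $first_datum  = qr/$first_separator  \s* ([^:]*?) $middle_separator/x;
my $middle_datum = qr/$middle_separator \s* ([^:]*?) $middle_separator/x;
my $last_datum   = qr/$middle_separator \s* ([^:]*?) \Z/x;

# Hypothetical sample line.
my $line = "TYPE: a, NAME: b, c, SIZE: d\n";

my ($first)  = $line =~ $first_datum;
my ($middle) = $line =~ $middle_datum;
my ($last)   = $line =~ $last_datum;

print "first:  [$first]\n";     # first:  [a]
print "middle: [$middle]\n";    # middle: [b, c]
print "last:   [$last]\n";      # last:   [d]
```

Note that the middle form consumes the separator on both sides, so walking a long line with it repeatedly would skip every other datum; a lookahead for the trailing separator avoids that.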
Perl regexes are expressive enough that you *could* get all of this into one regex.
But why bother?
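For the curious, a minimal sketch of the one-regex route, under the same assumptions as above. It walks the line with /g and uses a lookahead for "the next separator or end of line". The sample line and the %record hash are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample line.
my $line = "TYPE: a, NAME: b, c, SIZE: d\n";

my %record;
while ($line =~ /
        ([^\s:]+) :             # field title, captured without its colon
        \s*
        ([^:]*?)                # datum: lazily, anything that is not a colon
        \s*
        (?= ,\s+[^\s:]+: | \Z ) # stop before the next separator or the end
    /gx) {
    $record{$1} = $2;
}

print "$_ => $record{$_}\n" for sort keys %record;
```

It works on the sample, but it is harder to read than the separate pieces, which is rather the point of "why bother".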
On Wednesday 2003-03-26 09:11, Mike Starke wrote:
> I am still struggling with an expression to parse
> out a file of the following format:
>
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
>
> I thought I had it, but notice above how some data has
> a comma contained within; that pretty much ruined
> my 'split' function :-)
>
> Each line begins with the word 'TYPE', so I have been able to weed
> out everything in the file that is not relevant
> by using something like:
> if ($_ =~ /^TYPE.*/) {
>
> Beyond this, I just cannot seem to get the proper
> expression to grab the fields and their corresponding data.
>
> Anyone?
>
> v/r
> -Mike