perl regex

Author: Trent Shipley
Date:  
Subject: perl regex
- Assume the ':' character *never* occurs in data.
- Assume there are no quote or escape sequences.
- A line is delimited by Perl's magic "\n".

- 'FIELD' is not really a constant but a non-empty string of printable,
non-blank characters, none of which is ':'.

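my $field_title = qr/[^\s:]+:/;
     # legal: one or more non-blank, non-':' characters, then ':'

# The same thing written with a POSIX character class. No "use POSIX;" is
# needed: bracketed POSIX classes are built into Perl's regex engine.

$field_title = qr/[^:[:space:]]+:/;
     # also legal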
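
A quick way to convince yourself that the two spellings agree is to run both against a few strings. This is only a sketch: the sample strings and the helper name leads_with_title are made up for illustration and are not from Mike's file.

```perl
use strict;
use warnings;

# The two spellings of "non-blank, non-colon characters, then a colon".
my $title_perl  = qr/[^\s:]+:/;
my $title_posix = qr/[^:[:space:]]+:/;

# Hypothetical sample strings, invented for this check.
my @samples = ('FIELD:', 'TYPE: data', 'no title here', ': empty', 'two words:');

# Hypothetical helper: 1 if $string starts with a field title, else 0.
sub leads_with_title {
    my ($re, $string) = @_;
    return $string =~ /\A$re/ ? 1 : 0;
}

for my $s (@samples) {
    printf "%-16s perl=%d posix=%d\n", "'$s'",
        leads_with_title($title_perl,  $s),
        leads_with_title($title_posix, $s);
}
```

Only the first two samples start with a title; the other three fail under both spellings.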



I see three cases for separators; one is implicit.

my $first_separator = qr/\A$field_title/;
my $middle_separator = qr/, $field_title/;
#my $end_separator = qr/\Z/; # the implicit "\n".
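
To see what the separators actually pick out, here is a small sketch run against one line. The sample line and its field names (NAME, COLOR, SIZE) are my own invention, shaped like the format in Mike's message.

```perl
use strict;
use warnings;

my $field_title      = qr/[^\s:]+:/;
my $first_separator  = qr/\A$field_title/;
my $middle_separator = qr/, $field_title/;

# Hypothetical sample line; the field names are invented for illustration.
my $line = "NAME: widget, COLOR: red, blue, SIZE: large\n";

# The first separator is anchored at the start of the line.
my ($first) = $line =~ /($first_separator)/;

# In list context with /g, every middle separator on the line is returned.
my @middle = $line =~ /($middle_separator)/g;

print "first:  '$first'\n";
print "middle: '$_'\n" for @middle;
```

The comma inside "red, blue" is not taken for a separator, because what follows it is not a run of non-blank characters ending in ':'.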


Data is between separators with one of three forms.


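# qr// compiles a pattern for later use; m// would match against $_ right away.
# Each datum is a run of non-':' characters, captured in $1. The lookahead
# leaves the following separator in place for the next match.
my $first_datum   = qr/$first_separator
                       ([^:]*?)
                       (?=$middle_separator)
                      /x;
my $middle_datum  = qr/$middle_separator
                       ([^:]*?)
                       (?=$middle_separator)
                      /x;
my $last_datum    = qr/$middle_separator
                       ([^:]*?)
                       \Z
                      /x;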
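
Here is one way the three forms might be put to work on a line. Treat it as a sketch under the assumptions listed at the top: the helper name data_of and the sample line are hypothetical, each datum is captured in $1, and a lookahead is used for the trailing separator so that one match does not swallow the start of the next. A line with a single field would need a fourth case and is not handled.

```perl
use strict;
use warnings;

my $field_title      = qr/[^\s:]+:/;
my $first_separator  = qr/\A$field_title/;
my $middle_separator = qr/, $field_title/;

# \s* skips the blank after the colon; ([^:]*?) captures the datum lazily.
my $first_datum  = qr/$first_separator  \s* ([^:]*?) (?=$middle_separator)/x;
my $middle_datum = qr/$middle_separator \s* ([^:]*?) (?=$middle_separator)/x;
my $last_datum   = qr/$middle_separator \s* ([^:]*?) \Z/x;

# Hypothetical helper: returns the data of one line, in order.
sub data_of {
    my ($line) = @_;
    my @data;
    push @data, $1 if $line =~ $first_datum;
    while ($line =~ /$middle_datum/g) {   # zero or more middle data
        push @data, $1;
    }
    push @data, $1 if $line =~ $last_datum;
    return @data;
}

# Hypothetical sample line, shaped like the format in the question.
my @data = data_of("NAME: widget, COLOR: red, blue, SIZE: large\n");
print join(' | ', @data), "\n";
```

The embedded comma in "red, blue" survives because the lazy match keeps extending until a real ", TITLE:" follows.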



Perl regexes are expressive enough that you *could* get all of this into one regex.
But why bother?
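
For what it's worth, a single-pattern version might look something like the sketch below. It is one possible approach, not the only one; the names $pair and parse_pairs and the sample line are hypothetical, and it leans on the same assumptions as above (no ':' in data, no quoting or escapes).

```perl
use strict;
use warnings;

# One title/datum pair. Under /x a literal space has to be written as [ ].
my $pair = qr/
    (?: \A | ,[ ] )               # start of line, or the ", " before a title
    ([^\s:]+) :                   # $1: field title, without its colon
    \s*
    ([^:]*?)                      # $2: datum, lazily, no colon allowed
    (?= ,[ ][^\s:]+: | \s*\z )    # stop before the next title or at line end
/x;

# Hypothetical helper: returns a list of [title, datum] pairs for one line.
sub parse_pairs {
    my ($line) = @_;
    my @pairs;
    while ($line =~ /$pair/g) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Hypothetical sample line, shaped like the format in the question.
my @pairs = parse_pairs("NAME: widget, COLOR: red, blue, SIZE: large\n");
printf "%s => %s\n", @$_ for @pairs;
```

It does the job in one loop, at the cost of a pattern that is harder to read than the three small ones.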


On Wednesday 2003-03-26 09:11, Mike Starke wrote:
> I am still struggling with an expression to parse
> out a file of the following format:
>
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
> FIELD: data, FIELD: more, data, FIELD: data ....
>
> I thought I had it, but notice above how some data has
> a comma contained within; that pretty much ruined
> my 'split' function :-)
>
> Each line begins with the word 'TYPE', so I have been able to weed
> out everything in the file that is not relevant
> by using something like:
> if ($_ =~ /^TYPE.*/) {
>
> Beyond this, I just can not seem to get the proper
> expression to grab the fields and their corresponding data.
>
> Anyone?
>
> v/r
> -Mike
> ---------------------------------------------------
> PLUG-discuss mailing list -
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss