regular expressions

Julian M Catchen julian@catchen.org
Mon, 19 Feb 2001 23:23:54 -0700


Thanks to both of you for your help!  I have got it going now.  A few lines
of sample code in the manual would have gone a long way...but hey, that's
what the neighborhood LUG is for.

Turns out I had a misplaced "struct" keyword.  My other main experience
with system structs (besides sockets) is the tm struct used by the time
functions.  Whenever I declare one of those, I include the "struct" keyword
in the declaration, for example: struct tm *tstruct;  But that is not how
it is done for the regex library, where regex_t is a typedef name and is
declared without "struct".
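
For anyone else who trips over this, here is a minimal sketch of the two
declaration styles side by side.  It assumes a POSIX system with <regex.h>;
year_of and pattern_matches are made-up helper names for illustration, not
library functions.

```c
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* "tm" is a struct tag declared in <time.h>, so the "struct" keyword
   is required when naming the type.  */
int
year_of (time_t t)
{
  struct tm *tstruct = localtime (&t);

  return tstruct != NULL ? tstruct->tm_year + 1900 : -1;
}

/* "regex_t" is a typedef name declared in <regex.h>.  Writing
   "struct regex_t" would name a different struct type that is never
   defined, which is why the compiler cannot work out its size.
   Returns 1 on a match, 0 on no match, -1 if the pattern is invalid.  */
int
pattern_matches (const char *pattern, const char *text)
{
  regex_t rx;
  int found;

  if (regcomp (&rx, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  found = (regexec (&rx, text, 0, NULL, 0) == 0);
  regfree (&rx);
  return found;
}
```

Calling pattern_matches ("reg(exec|comp)", "regcomp") should report a match;
the point is only that rx is declared as plain "regex_t rx;".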

Anyway -- thanks.


On Mon, 19 Feb 2001 22:52:34 Kevin Buettner wrote:
> On Feb 19,  6:06pm, Julian M Catchen wrote:
> 
> > Ok, I was confused, the rx library is included by default in glibc.  The
> > problem I am having is the following:
> > 
> > To use the regex functions, I have to create a structure, regex_t.  If I
> > just declare it like this: 
> > 
> > struct regex_t rx;
> > 
> > The compiler errors out saying "storage size of `rx' isn't known".
> > 
> > Can anyone give me some pointers on how to malloc this thing?
> 
> Do you have a ``#include <regex.h>'' statement prior to your
> declaration for ``rx''?
> 
> Below is a simple program which demonstrates the use of the functions
> documented on the regcomp() man page.  To try it, put it in a file
> called simple-grep.c and do the following:
> 
> ocotillo:ctests$ gcc -Wall -o simple-grep -g simple-grep.c 
> ocotillo:ctests$ ./simple-grep 'reg(exec|comp)' <simple-grep.c 
> Line 21:   errcode = regcomp (&rx, argv[1], REG_EXTENDED | REG_NOSUB);
> Line 37:       if (regexec (&rx, line, 0, 0, 0) == 0)
> 
> --- simple-grep.c ---
> #include <sys/types.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <regex.h>
> #define MAXLINESIZE 4096
> int
> main (int argc, char **argv)
> {
>   regex_t rx;
>   int errcode;
>   char line[MAXLINESIZE];
>   int linenum;
> 
>   if (argc != 2)
>     {
>       fprintf (stderr, "Usage: %s pattern\n", argv[0]);
>       exit (1);
>     }
> 
>   errcode = regcomp (&rx, argv[1], REG_EXTENDED | REG_NOSUB);
>   if (errcode != 0)
>     {
>       char *buf;
>       size_t bufsize;
> 
>       bufsize = regerror (errcode, &rx, 0, 0);
>       buf = alloca (bufsize);
>       regerror (errcode, &rx, buf, bufsize);
>       fprintf (stderr, "Error compiling pattern: %s\n", buf);
>       exit (1);
>     }
> 
>   linenum = 1;
>   while (fgets (line, MAXLINESIZE, stdin))
>     {
>       if (regexec (&rx, line, 0, 0, 0) == 0)
> 	{
> 	  printf ("Line %d: %s", linenum, line);
> 	  if (line[strlen (line) - 1] != '\n')
> 	    printf ("\n");
> 	}
> 
>       if (strlen (line) != MAXLINESIZE - 1
>           || line[MAXLINESIZE - 2] == '\n')
> 	linenum++;
>     }
> 
>   regfree (&rx);
>   exit (0);
> }
> --- end simple-grep.c ---
> 
> By way of comparison, the above C program is roughly equivalent to the
> following perl program:
> 
> --- simple-grep.pl ---
> #!/usr/bin/perl -w
> 
> $pattern = shift;
> 
> while (<>) {
>   chomp;
>   print "$ARGV, $.:$_\n"	if /$pattern/;
> }
> --- end simple-grep.pl ---
> 
> Actually, the perl solution is superior since it is capable of reading
> from files whose names are supplied from the command line as well as
> STDIN.  Also, it deals with long lines much less clumsily.
> 
> Here's an example of the perl program in action:
> 
> ocotillo:ctests$ ./simple-grep.pl 'reg(exec|comp)' <simple-grep.c 
> -, 21:  errcode = regcomp (&rx, argv[1], REG_EXTENDED | REG_NOSUB);
> -, 37:      if (regexec (&rx, line, 0, 0, 0) == 0)
> ocotillo:ctests$ ./simple-grep.pl 'reg(exec|comp)' simple-grep.c 
> simple-grep.c, 21:  errcode = regcomp (&rx, argv[1], REG_EXTENDED | REG_NOSUB);
> simple-grep.c, 37:      if (regexec (&rx, line, 0, 0, 0) == 0)
> 
> HTH,
> 
> Kevin
> 
> Plug-discuss mailing list  -  Plug-discuss@lists.PLUG.phoenix.az.us
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
> 
> 
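Kevin's two reservations about the C version (it reads only stdin, and it
handles long lines clumsily) can probably both be addressed with POSIX
getline(), which grows its buffer as needed.  The following is only a
sketch under that assumption: grep_stream is a hypothetical helper, not
part of the regex library, and it takes an already-compiled pattern plus
any open stream, so the caller can pass stdin or a file opened from argv.

```c
#define _POSIX_C_SOURCE 200809L	/* for getline() */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

/* Print every line of IN that matches RX on OUT as "NAME, LINENUM:LINE",
   the same format the perl script uses, and return the number of
   matching lines.  Hypothetical helper for illustration only.  */
long
grep_stream (FILE *in, const char *name, const regex_t *rx, FILE *out)
{
  char *line = NULL;		/* getline() allocates and grows this */
  size_t cap = 0;
  ssize_t len;
  long linenum = 0;
  long matches = 0;

  while ((len = getline (&line, &cap, in)) != -1)
    {
      linenum++;
      if (len > 0 && line[len - 1] == '\n')
	line[len - 1] = '\0';	/* like perl's chomp */
      if (regexec (rx, line, 0, NULL, 0) == 0)
	{
	  fprintf (out, "%s, %ld:%s\n", name, linenum, line);
	  matches++;
	}
    }

  free (line);
  return matches;
}
```

A main() along the lines of the original would compile the pattern with
regcomp(), then call grep_stream (stdin, "-", &rx, stdout) when no file
names are given, or fopen() each remaining argument and pass that stream
instead.  Because every call to getline() returns one whole line, the
MAXLINESIZE bookkeeping for the line counter is no longer needed.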