Python anyone?

Joseph T. Tannenbaum tannenba@futureone.com
Wed, 7 Jun 2000 16:31:43 -0700


Hello,

I am going through Mastering Regular Expressions by Jeffrey Friedl (the
O'Reilly book), and I got to a snippet that uses a regular expression in
Python.  I am using Python 1.5.2 for Win98, and this program doesn't work
correctly.  (It is supposed to point out doubled words in a text file,
e.g. "the the world.")  The main part I am in doubt about is setting up
'data' prior to printing.  Is this format correct?  It is on pg 57 of the
book, and the reg3 line is correct per the errata.

Thanks
Joe

Here is the snippet:

import sys; import regex; import regsub

### Prepare the three regexes we'll use
reg1 = regex.compile(
			'\\b\([a-z]+\)\(\([\n\r\t\f\v ]\|<[^>]+>\)+\)\(\\1\\b\)',
			regex.casefold)
reg2 = regex.compile('^\([^033]*\n\)+')
reg3 = regex.compile('^\(.\)')

for filename in sys.argv[1:]:           	 	# for each file...
	try:
		file = open(filename)             	 	# try opening file
	except IOError, info:
		print '%s: %s' % (filename, info[1])	# report error if couldn't
		continue											# and also abort this iteration.

	# Slurp the whole file to 'data', apply regexes and print
	data = file.read()
	data = regsub.gsub(reg1, '\033[7m\\1\033[m\2\033[7m\\4\033[m', data)
	data = regsub.gsub(reg2, '', data)
	data = regsub.gsub(reg3, filename + ':  \\1', data)
	print data,
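
Two details in the snippet as typed may be worth checking against the
printed page.  In a Python string literal, '\2' is an octal escape for the
character with code 2, not a backreference (the neighbouring references are
written '\\1' and '\\4').  Likewise, [^033] is a character class that
excludes the digits 0 and 3, whereas [^\033] would exclude the escape
character.  For comparison, here is a minimal sketch of the same
doubled-word check using the `re` module instead of `regex`/`regsub`.  This
is an assumption for illustration, not the book's code: the names
`mark_doubles` and `report` are hypothetical, and because the inner group is
non-capturing here, the second copy of the word is group 3 rather than
group 4.

```python
import re

# ANSI reverse-video on/off, the same markers the snippet uses.
HIGHLIGHT_ON = '\033[7m'
HIGHLIGHT_OFF = '\033[m'

# A word, then whitespace and/or <tags>, then the same word again.
DOUBLED = re.compile(r'\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)', re.IGNORECASE)


def mark_doubles(text):
    """Wrap both copies of each doubled word in highlight markers."""
    replacement = (HIGHLIGHT_ON + r'\1' + HIGHLIGHT_OFF + r'\2' +
                   HIGHLIGHT_ON + r'\3' + HIGHLIGHT_OFF)
    return DOUBLED.sub(replacement, text)


def report(text, label):
    """Return only the lines that contain a highlight, prefixed with label."""
    marked = mark_doubles(text)
    return [label + ': ' + line
            for line in marked.split('\n') if '\033' in line]


if __name__ == '__main__':
    for line in report('one\ntwo two\nthree', 'sample.txt'):
        print(line)
```

Filtering with a list comprehension stands in for the reg2/reg3 steps: it
keeps only the lines that received a highlight and prefixes each with the
label.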