"Joseph T. Tannenbaum" wrote: > > Hello, > > I am going through Mastering Regular Expressions by Jeffrey Friedl (O'Reilly > book) > and I got to a snippet of using a regular expression in python. I am using > python > 1.5.2 for win98, and this prog doen't work correctly. (it is supposed to > point out > double words in a text file ei "the the world." The main part I am in doubt > about > is setting up 'data' prior to printing. Is this format correct? It is on > pg 57 of > the book, and the reg3 line is correct by the errata. > > Thanks > Joe > > Here is the snippet: > > import sys; import regex; import regsub > > ### Prepare the three regeses w'll use > reg1 = regex.compile( > '\\b\([a-z]+\)\(\([\n\r\t\f\v ]\|<[^>]+>\)+\)\(\\1\\b\)', > regex.casefold) > reg2 = regex.compile('^\([^033]*\n\)+') ^\033 This my be just a post typo, but... > reg3 = regex.compile('^\(.\)') > > for filename in sys.argv[1:]: # for each file... > try: > file = open(filename) # try opening file > except IOError, info: > print '%s: %s' % (filename, info[1]) # report error if couldn't > continue # and also abort this iteration. > > data = file.read() # Slurp the whole file to 'data', apply regexes > and print > data = regsub.gsub(reg1, '\033[7m\\1\033[m\2\033[7m\\4\033[m', data) ^^^ ^^^ ^^^ ^^^ Please excuse ignorance, but what are these? > data = regsub.gsub(reg2, '', data) > data = regsub.gsub(reg3, filename + ': \\1', data) > print data, ^ I guess this just depends on what's next? Note: Don't really know jack diddly bout Python. Just attracted by the smell of regexs. Saaalute!! MD > > _______________________________________________ > Plug-discuss mailing list - Plug-discuss@lists.PLUG.phoenix.az.us > http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss