Thought of that; the overhead is worse than scraping, parsing, and searching.

On Fri, Jul 31, 2009 at 7:51 AM, Lisa Kachold wrote:
> Try using Google?
>
> On 7/31/09, Bryan O'Neal wrote:
> > OK, so I want to go through a web page, as efficiently as possible,
> > and ask how many of a set of keywords appear in that page. Does
> > anyone know of a good open source tool for this?
> >
> > I have hundreds of web pages and a nearly equal number of keyword
> > sets, so scraping each page, parsing it to create a vector of
> > strings, and running a set of nested for loops through each vector
> > to compare against the words in the keyword vector is, well, FAR
> > from efficient.
> >
> > I heard of Apache Velocity, but that seems to be for creating pages
> > on the fly. I also heard of Apache Lucene, but it appears to be for
> > implementing your own query engine on your application server (to
> > index and query your pages).
> >
> > Also, if you know of a local ACTIVE Java forum I would love to know
> > about it. I have subscribed to a half dozen lists and there is
> > nothing but silence.
> >
> > Thanks a bunch :)
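
For what it's worth, the nested-loop comparison described in the original question can usually be avoided by indexing the keywords once and then making a single pass over each page's words. The following is only a rough sketch under some assumptions, not a recommendation of a particular tool: it assumes the page text has already been scraped and stripped of markup, that every keyword is a single word, and that matching is case-insensitive. The names tokenize, build_index, and count_matches are made up for this illustration.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(WORD_RE.findall(text.lower()))


def build_index(keyword_sets):
    """Map each keyword to the names of the keyword sets that contain it."""
    index = defaultdict(set)
    for name, words in keyword_sets.items():
        for word in words:
            index[word.lower()].add(name)
    return index


def count_matches(page_text, index):
    """Return {set name: number of its keywords found in the page}."""
    counts = defaultdict(int)
    # One pass over the page's distinct words; each lookup is a hash probe.
    for word in tokenize(page_text):
        for name in index.get(word, ()):
            counts[name] += 1
    return dict(counts)


# Hypothetical keyword sets and page text, for illustration only.
keyword_sets = {
    "java": {"Java", "Lucene"},
    "web": {"page", "scraping", "html"},
}
index = build_index(keyword_sets)
print(count_matches("Scraping a page, then scraping another page with Lucene.", index))
```

The index is built once for all keyword sets, so the cost per page is roughly proportional to the number of distinct words on that page rather than to pages times keyword sets times words. Multi-word phrases would need a different approach, which this sketch does not attempt.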