Thought of that, but the overhead is worse than scraping, parsing, and searching.

On Fri, Jul 31, 2009 at 7:51 AM, Lisa Kachold <lisakachold@obnosis.com> wrote:
Try using Google?

On 7/31/09, Bryan O'Neal <boneal@cornerstonehome.com> wrote:
> Ok, so I want to go through a set of web pages, as efficiently as possible,
> and ask how many of a set of keywords appear in each page. Does anyone know
> of a good open source tool for this?
> I have hundreds of web pages and a nearly equal number of keyword sets, so
> scraping each page, parsing it into a vector of strings, and running a set
> of nested for loops to compare each vector against the words in the
> keyword vector is, well, FAR from efficient.
> I heard of Apache Velocity, but that seems to be for creating pages on the
> fly. I also heard of Apache Lucene, but it appears to be for implementing your
> own query engine on your application server (to index and query your pages).
>
> Also, if you know of a local ACTIVE Java forum I would love to know about
> it. I have subscribed to half a dozen lists and there is nothing but
> silence.
>
> Thanks a bunch :)
>
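
For what it's worth, the nested for loops described above can usually be avoided by putting each page's words into a hash set once and then looking up every keyword in it, so the cost per page is roughly (words on the page + keywords checked) rather than their product. Below is a minimal sketch of that idea in plain Java. It assumes the page text has already been scraped and stripped of markup, and the class and method names (KeywordCounter, tokenize, countMatches) are made up for illustration, not taken from any library. It only matches single-word keywords, case-insensitively.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: counts how many keywords from a list occur in a page.
class KeywordCounter {

    // Split already-extracted page text into a set of lowercase words.
    // Anything that is not a letter or digit is treated as a separator.
    static Set<String> tokenize(String pageText) {
        Set<String> words = new HashSet<>();
        String lower = pageText.toLowerCase(Locale.ROOT);
        for (String token : lower.split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Count the keywords that appear in the page's word set.
    // Each lookup is a hash probe, so there is no inner loop over the page.
    // Duplicate entries in 'keywords' are counted once each, so pass a Set
    // if distinct counts are needed.
    static int countMatches(Set<String> pageWords, Iterable<String> keywords) {
        int matches = 0;
        for (String keyword : keywords) {
            if (pageWords.contains(keyword.toLowerCase(Locale.ROOT))) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample text standing in for a scraped page (an assumption for the demo).
        Set<String> pageWords = tokenize("Apache Lucene is a search library written in Java.");
        List<String> keywords = Arrays.asList("lucene", "java", "velocity");
        System.out.println(countMatches(pageWords, keywords));
    }
}
```

With hundreds of pages and hundreds of keyword sets, each page would be tokenized once and its word set reused for every keyword set. Multi-word phrases or stemming would need something more than this, which is roughly where an indexer such as Lucene starts to pay off.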


--

(623)239-3392
(503)754-4452 www.obnosis.com
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss