Java web page keyword search protocol

Bryan O'Neal boneal at cornerstonehome.com
Sat Aug 1 07:09:11 MST 2009


Thought of that; the overhead is worse than scraping, parsing, and
searching.
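
For anyone following the thread, here is a minimal sketch of the scrape,
parse, and search approach with the nested loops replaced by hash lookups.
It assumes the page HTML has already been fetched into a String. The class
and method names (KeywordCounter, stripTags, wordsOf, countPresent) are made
up for illustration, and the regex tag stripper is a crude stand-in for a
real HTML parser. Each page is tokenized once into a HashSet, so checking a
keyword set costs one lookup per keyword instead of a pass over the page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class KeywordCounter {

    // Crude tag stripper: an assumption for illustration only.
    // A real HTML parser would handle scripts, comments, and entities.
    public static String stripTags(String html) {
        return html.replaceAll("(?s)<[^>]*>", " ");
    }

    // Tokenize the page once into a set of lowercase words.
    public static Set<String> wordsOf(String html) {
        String text = stripTags(html).toLowerCase(Locale.ROOT);
        Set<String> words = new HashSet<>();
        for (String w : text.split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Count how many keywords of one set occur in the page.
    // Each check is a single hash lookup, not a scan of the page.
    public static int countPresent(Set<String> pageWords, Set<String> keywords) {
        int hits = 0;
        for (String k : keywords) {
            if (pageWords.contains(k.toLowerCase(Locale.ROOT))) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Linux Users</h1>"
                + "<p>Java and Lucene talk.</p></body></html>";
        Set<String> pageWords = wordsOf(html);
        Set<String> keywords =
                new HashSet<>(Arrays.asList("java", "lucene", "velocity"));
        System.out.println(countPresent(pageWords, keywords)); // prints 2
    }
}
```

With hundreds of pages and keyword sets, the set for each page can be built
once and reused for every keyword set. This only matches single whole words;
multi-word phrases or stemming would need something more, which is roughly
where an indexer such as Lucene comes in.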

On Fri, Jul 31, 2009 at 7:51 AM, Lisa Kachold <lisakachold at obnosis.com> wrote:

> Try using Google?
>
> On 7/31/09, Bryan O'Neal <boneal at cornerstonehome.com> wrote:
> > OK, so I want to go through a set of web pages, as efficiently as
> > possible, and ask how many of a set of keywords appear in each page.
> > Does anyone know of a good open source tool for this?
> > I have hundreds of web pages and a nearly equal number of keyword sets,
> > so scraping each page, parsing it into a vector of strings, and running
> > a set of nested for loops to compare each vector against the words in
> > the keyword vector is, well, FAR from efficient.
> > I have heard of Apache Velocity, but that seems to be for creating pages
> > on the fly. I have also heard of Apache Lucene, but it appears to be for
> > implementing your own query engine on your application server (to index
> > and query your pages).
> >
> > Also, if you know of a local ACTIVE Java forum I would love to know about
> > it. I have subscribed to a half dozen lists and there is nothing but
> > silence.
> >
> > Thanks a bunch :)
> >