> What I want is a way to create a list that will be used to search the
> internet and then have the results stored -- not the actual web pages,
> just the URLs -- and then have a way for the URLs to be reviewed.
Yep; using HTTrack (and probably wget) you can feed it a list of URLs
that you want crawled, and have it "throw away" the pages and just
store the links (the URLs) in a log. In other words, it harvests
links, not pages (in this case).
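If you'd rather see the idea in plain Python, here is a rough sketch of
the same "harvest links, not pages" trick -- fetch each page from a list,
keep only the hrefs, discard the content. (HTTrack and wget do this with
their own options; the file names urls.txt and harvested_links.log below
are made up for the example.)

import urllib.request
from html.parser import HTMLParser

class LinkHarvester(HTMLParser):
    """Collects href values from anchor tags; the page text is discarded."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)   # relative links are logged as-is

with open("urls.txt") as url_list, open("harvested_links.log", "w") as log:
    for url in (line.strip() for line in url_list):
        if not url:
            continue
        try:
            page = urllib.request.urlopen(url, timeout=15).read().decode("utf-8", "replace")
        except Exception as err:
            log.write("# failed: %s (%s)\n" % (url, err))
            continue
        harvester = LinkHarvester()
        harvester.feed(page)           # parse the page, harvest its links...
        for link in harvester.links:   # ...then keep only the URLs
            log.write(link + "\n")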
If I have a long list of pages (URLs) that I need to sequence through,
I use URLSlideShow from
http://slideshow.rockhoward.com/. That's my
fast way to "review" thousands of sites. Just hit "next, next, next."
Beautiful.
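If you want to cobble together a crude version of that yourself, a few
lines of Python will pop each URL in your default browser and wait for
Enter before moving on (again, just a sketch; urls.txt is a made-up
file name):

import webbrowser

with open("urls.txt") as url_list:
    for url in (line.strip() for line in url_list):
        if not url:
            continue
        webbrowser.open(url)   # show the page
        input("Viewing %s -- press Enter for the next one..." % url)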
That's the hands-on, do it yourself approach.
There are also products like
http://www.aignes.com/ and online
services like
http://www.changedetect.com/ that you can sign up for,
to detect changes on pages and send you an alert. These just do the
above *for* you, and spare you the details. In return, they want $$.
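For the curious, the core idea behind those services is nothing fancier
than "hash the page now, compare with the hash from last time." A rough
Python sketch of that (urls.txt and page_hashes.json are invented names
for the example; a real service would email you instead of printing):

import hashlib
import json
import os
import urllib.request

STATE_FILE = "page_hashes.json"   # stores the previous run's hashes

old_hashes = {}
if os.path.exists(STATE_FILE):
    with open(STATE_FILE) as f:
        old_hashes = json.load(f)

new_hashes = {}
with open("urls.txt") as url_list:
    for url in (line.strip() for line in url_list):
        if not url:
            continue
        try:
            body = urllib.request.urlopen(url, timeout=15).read()
        except Exception as err:
            print("failed: %s (%s)" % (url, err))
            continue
        digest = hashlib.sha256(body).hexdigest()
        new_hashes[url] = digest
        if url in old_hashes and old_hashes[url] != digest:
            print("CHANGED: %s" % url)   # this is where the alert would go

with open(STATE_FILE, "w") as f:
    json.dump(new_hashes, f)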
> Is a spider what I am looking for? I have looked at a lot of the
> spider projects and they seem to be for different uses.
Not sure what, specifically, you are stuck on. Spiders are not
mysterious. Feel free to contact me off-list if you have more
questions... My advice is to try anything -- try it the hard way, or
the dumb way, but get moving -- and you will find you can figure out
the rest. That's my brute-force approach to life.
(the other) Craig