Re: Web crawling

Author: Craig White
Date:  
To: plug-discuss
Subject: Re: Web crawling
On Mon, 2005-02-28 at 08:32 -0700, Nathan England wrote:
> I asked this question, because I can only get wget to use spider if I provide
> a file for it to follow. Otherwise it only reports back with the index.html
> then exits... I just can't get it to work.
>

---
yeah - in re-reading the man page for wget...
---
--spider
When invoked with this option, Wget will behave as a Web spider, which
means that it will not download the pages, just check that they are
there. For example, you can use Wget to check your bookmarks:

        wget --spider --force-html -i bookmarks.html


This feature needs much more work for Wget to get close to the
functionality of real web spiders.
---
probably need a different 'spider' - a freshmeat.net search returns 27
results for 'spider' and 19 for 'web spider'
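As a rough illustration of what a link-checking spider does under the hood
(wget's --spider mode with -i essentially does this: pull the hrefs out of a
page, then check each one exists), here is a minimal sketch, not from the
thread, that extracts links from an HTML document using Python's standard
html.parser:

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all anchor hrefs from an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    return parser.links
```

A real spider would then issue a HEAD request for each collected URL and
report the dead ones - which is the part of the job the wget man page admits
is still rudimentary.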

Craig

---------------------------------------------------
PLUG-discuss mailing list -
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss