Web Scraping
der.hans
PLUGd at LuftHans.com
Tue Oct 10 23:50:32 MST 2006
On 10 Oct 2006, Joshua Zeidner said:
> Does anyone here have any experience developing Web Scraping
> applications? any suggestions for tools? I can manage Perl, PHP, Java,
> Python, and C/C++. Recommendations are welcome.
Hi Josh,
I used WWW::Mechanize for a project and was quite happy with it. It
doesn't handle JavaScript, but neither did any of the other projects I
found. For that project I had to log in first, then scrape multiple pages
to get the data I wanted.
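Here is a rough sketch of that log-in-then-scrape flow. It is written in Python with only the standard library, not with WWW::Mechanize, so treat it as an analogue of the approach rather than the code from that project. The function names, the form field names, and the idea that the start page links to the pages you want are all assumptions made for illustration.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urllib.parse.urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return absolute URLs for every link found in the HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def make_session():
    """Return an opener that keeps cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login_and_scrape(opener, login_url, credentials, start_url):
    """Log in with a form POST, then fetch every page linked from start_url.

    `credentials` is a dict of form fields; the field names depend on the
    site's login form and are not known here.
    """
    form = urllib.parse.urlencode(credentials).encode("utf-8")
    opener.open(login_url, data=form).close()  # session cookie is stored in the jar

    with opener.open(start_url) as resp:
        index_html = resp.read().decode("utf-8", errors="replace")

    pages = {}
    for url in extract_links(index_html, start_url):
        with opener.open(url) as resp:
            pages[url] = resp.read().decode("utf-8", errors="replace")
    return pages
```

Calling it would look something like `login_and_scrape(make_session(), login_url, {"user": "...", "pass": "..."}, start_url)`, with the URLs and field names taken from the real site. Like WWW::Mechanize, this only sees the HTML the server sends and does not run JavaScript.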
I believe there's a Firefox module that lets you automate replaying
mouse/keyboard events. It's likely driven by screen regions rather than by
parsing the page, but it might handle JavaScript.
The Debian package for WWW::Mechanize is:
libwww-mechanize-perl - Automate interaction with websites
ciao,
der.hans
--
# https://www.LuftHans.com/ http://www.CiscoLearning.org/
# Join the League of Professional System Administrators! https://LOPSA.org/
# "If I want my children to work hard, I better be the hardest working
# person they've ever met. If I want the children to be nice, I better
# be the kindest human being they've ever met." -- Rafe Esquith
More information about the PLUG-discuss mailing list