Web Scraping

der.hans PLUGd at LuftHans.com
Tue Oct 10 23:50:32 MST 2006


On 10 Oct 2006, Joshua Zeidner wrote:

> Does anyone here have any experience developing Web Scraping
> applications?  any suggestions for tools?  I can manage Perl, PHP, Java,
> Python, and C/C++.  Recommendations are welcome.

moin moin Josh,

I used WWW::Mechanize for a project and was quite happy with it. It
doesn't handle JavaScript, but neither did any of the other tools I found.
For that project I had to log in first, then scrape multiple pages to get
the data I wanted.
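
To make the log-in-then-scrape pattern concrete, here is a minimal sketch.
It is written in Python using only the standard library, not
WWW::Mechanize, so treat it as an illustration of the flow rather than what
my project actually ran. The form field names "username" and "password",
the assumption that the form posts back to the login URL, and the link
pattern are all hypothetical and will differ per site.

```python
import re
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar


class FormAndLinkParser(HTMLParser):
    """Collect named <input> values and <a href> targets from one page."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            # Keeps hidden fields (e.g. tokens) so they can be re-submitted.
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def parse_page(html, base_url):
    """Return (input fields, absolute link URLs) found in the HTML."""
    parser = FormAndLinkParser()
    parser.feed(html)
    links = [urllib.parse.urljoin(base_url, href) for href in parser.links]
    return parser.fields, links


def scrape(login_url, username, password, link_pattern):
    """Log in, then fetch every linked page whose URL matches link_pattern."""
    # The cookie jar carries the session cookie across all requests.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))

    def fetch(url, data=None):
        with opener.open(url, data=data, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace"), resp.geturl()

    # Step 1: load the login page and pick up its existing input fields.
    html, url = fetch(login_url)
    fields, _ = parse_page(html, url)
    # Hypothetical field names; inspect the real form to find the right ones.
    fields.update({"username": username, "password": password})

    # Step 2: POST the credentials (assumes the form posts to login_url).
    body = urllib.parse.urlencode(fields).encode("ascii")
    html, url = fetch(login_url, data=body)

    # Step 3: follow matching links from the landing page, one fetch each.
    _, links = parse_page(html, url)
    return {link: fetch(link)[0] for link in links
            if re.search(link_pattern, link)}
```

A call would look something like scrape("https://example.test/login",
"me", "secret", r"/report/"), which returns a dict mapping each matching
URL to its page text. Like WWW::Mechanize, this only sees the HTML the
server sends, so anything built by JavaScript in the browser is invisible
to it.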

I believe there's a Firefox extension that lets you automate replaying
mouse and keyboard events. That's likely region-based (replaying events at
screen positions) rather than parse-based, but it might handle JavaScript.

The Debian package for WWW::Mechanize:

libwww-mechanize-perl - Automate interaction with websites

ciao,

der.hans
-- 
#  https://www.LuftHans.com/        http://www.CiscoLearning.org/
#  Join the League of Professional System Administrators! https://LOPSA.org/
#  "If I want my children to work hard, I better be the hardest working
#  person they've ever met. If I want the children to be nice, I better
#  be the kindest human being they've ever met." -- Rafe Esquith

