Re: Web page capture util

Author: der.hans
Date:  
To: Main PLUG discussion list
Subject: Re: Web page capture util
On 28 Aug 2006, Shawn Badger wrote:

> That doesn't look like it will work for me. Your suggestion about
> Firefox extensions makes me wonder whether there is an extension
> for Firefox that will allow you to script it from the command line. I
> know you can specify the window size and which page to open from the
> command line, but it would work for me if I could have it open a page
> and then save the complete page to a directory. Then I could use a
> different, as yet unknown, app to convert that information to a PDF or
> something.


Description: XML-to-any converter

xmlto is a front-end to an XSL toolchain. It chooses an appropriate
stylesheet for the conversion you want and applies it using an external
XSLT processor (currently, only xsltproc is supported). It also performs
any necessary post-processing.
It supports converting from DocBook XML to DVI, XSL-FO, HTML (multiple
pages), HTML (one page), man page, PDF, PostScript, and plain text. It
supports converting from XSL-FO to DVI, PDF and PostScript.

See the homepage for more information: http://cyberelk.net/tim/xmlto/
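
As a rough illustration of the description above, here is a minimal Python sketch that builds and runs an xmlto command line. The file name report.xml and the helper names are hypothetical. Note that xmlto expects DocBook XML (or XSL-FO) as input, so a saved HTML page would not convert directly.

```python
import shutil
import subprocess


def xmlto_command(docbook_file, fmt="pdf", out_dir="."):
    """Build the command line: xmlto -o <out_dir> <format> <file>."""
    return ["xmlto", "-o", str(out_dir), fmt, str(docbook_file)]


def convert(docbook_file, fmt="pdf", out_dir="."):
    """Run xmlto if it is installed; return True when it exits cleanly."""
    if shutil.which("xmlto") is None:
        return False  # xmlto is not on the PATH
    result = subprocess.run(xmlto_command(docbook_file, fmt, out_dir))
    return result.returncode == 0
```

Calling convert("report.xml") would ask xmlto to write a PDF into the current directory, assuming a PDF backend is installed.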

For slide's suggestion, there is ps2pdf. It's in the gs-common
package on Debian.
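
A minimal sketch of the ps2pdf step, assuming the page has already been printed to a PostScript file (page.ps is a hypothetical name, as are the helper functions). It only wraps the usual "ps2pdf input.ps output.pdf" invocation.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_file, pdf_file=None):
    """Build the command line: ps2pdf <input.ps> <output.pdf>."""
    ps_file = Path(ps_file)
    if pdf_file is None:
        # Default to the same name with a .pdf suffix.
        pdf_file = ps_file.with_suffix(".pdf")
    return ["ps2pdf", str(ps_file), str(pdf_file)]


def to_pdf(ps_file, pdf_file=None):
    """Run ps2pdf if it is installed; return True when it exits cleanly."""
    if shutil.which("ps2pdf") is None:
        return False  # ps2pdf is not on the PATH
    return subprocess.run(ps2pdf_command(ps_file, pdf_file)).returncode == 0
```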

I'm pretty certain there's an extension to save a page via a client-run
cron of some sort.

Maybe ScrapBook or another extension has a way to call it from the command
line.

https://addons.mozilla.org/firefox/427/

Konqueror might have a way to save pages from the command line. Poke
around with dcop or kdcop :).
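
For the mailing half of the job (a script run from cron that polls for new files, as suggested earlier in the thread), here is a minimal Python sketch. The directory, the seen-file, the addresses and the SMTP host are all assumptions, and it presumes something else has already dropped the PDFs into the watched directory.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def find_new_pdfs(watch_dir, seen_file):
    """Return PDFs in watch_dir not yet listed in seen_file, sorted by path."""
    seen_path = Path(seen_file)
    seen = set(seen_path.read_text().splitlines()) if seen_path.exists() else set()
    return [p for p in sorted(Path(watch_dir).glob("*.pdf")) if p.name not in seen]


def build_message(pdf_path, sender, recipients):
    """Build a mail with the PDF attached."""
    msg = EmailMessage()
    msg["Subject"] = f"Page capture: {pdf_path.name}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("The latest page capture is attached.")
    msg.add_attachment(pdf_path.read_bytes(), maintype="application",
                       subtype="pdf", filename=pdf_path.name)
    return msg


def mail_new_pdfs(watch_dir, seen_file, sender, recipients, host="localhost"):
    """Mail each new PDF once and record its name; return how many were sent."""
    new_files = find_new_pdfs(watch_dir, seen_file)
    if not new_files:
        return 0
    with smtplib.SMTP(host) as smtp:  # assumes a local mail server
        for pdf in new_files:
            smtp.send_message(build_message(pdf, sender, recipients))
            with open(seen_file, "a") as fh:
                fh.write(pdf.name + "\n")
    return len(new_files)
```

A crontab entry could call mail_new_pdfs() every few minutes. Recording the file names that were already sent keeps the same capture from going out twice.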

ciao,

der.hans

>
>
> On Mon, 2006-08-28 at 14:51 -0700, der.hans wrote:
>> On 28 Aug 2006, wrote:
>>
>>> Quoting Shawn Badger <>:
>>>
>>>> Does anyone know of a CLI app that can capture a web page to a JPG or,
>>>> better, a PDF? I need to capture a dynamic page on a daily basis and
>>>> e-mail the captured image to various people. I have tried using wget,
>>>> but it saves some weird results. I suspect that is because the page I
>>>> am polling is generated with PHP.
>>>>
>>>> Any ideas would be appreciated.
>>>
>>> wget downloads/saves the source text of the page, so if you want a jpg or
>>> pdf, I don't think that will help you.
>>>
>>> This app is web-based, not CLI, but it might be worth looking at:
>>> http://bluga.net/webthumb/index.php
>>
>> Josh is here in town (one of the people behind AzPHP), so maybe you can
>> convince him to let you use it on your company's intranet.
>>
>> Also, there are some extensions for Firefox that will run scripts. Maybe
>> you could set up Firefox to pull up the page and save it to PDF. A script
>> run from cron could then poll for new files to mail out.
>>
>> PrefBar is the extension I can think of off the top of my head. Not
>> available from the Firefox website, but still worth looking into.
>>
>> Is there a command line option to Firefox to save a URL to PDF?
>>
>> ciao,
>>
>> der.hans
>> --
>> #  https://www.LuftHans.com/        http://www.CiscoLearning.org/
>> #  Join the League of Professional System Administrators! https://LOPSA.org/
>> #  C'est la Net - der.hans
>> ---------------------------------------------------
>> PLUG-discuss mailing list -
>> To subscribe, unsubscribe, or to change your mail settings:
>> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss



-- 
#  https://www.LuftHans.com/        http://www.CiscoLearning.org/
#  Join the League of Professional System Administrators! https://LOPSA.org/
#  <allbery_b> wouldn't that be "shopping is hard, let's do math"?