Web page capture util

der.hans PLUGd at LuftHans.com
Mon Aug 28 16:25:01 MST 2006


Am 28. Aug, 2006 schwätzte Shawn Badger so:

> That doesn't look like it will work for me. Your suggestion about
> Firefox extensions makes me wonder if there is an extension for
> Firefox that would let you script it from the command line. I know
> you can specify how big a window and which page to open from the
> command line, but it would work for me if I could have it open a page
> and then save the complete page to a directory. Then I could use some
> other app to convert that information to a PDF or something.

From the Debian package description:

Description: XML-to-any converter

  xmlto is a front-end to an XSL toolchain. It chooses an appropriate
  stylesheet for the conversion you want and applies it using an external
  XSLT processor (currently, only xsltproc is supported). It also performs
  any necessary post-processing.
  .
  It supports converting from DocBook XML to DVI, XSL-FO, HTML (multiple
  pages), HTML (one page), man page, PDF, PostScript, and plain text. It
  supports converting from XSL-FO to DVI, PDF and PostScript.

  See the homepage for more information: http://cyberelk.net/tim/xmlto/
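As a sketch of how that toolchain looks in practice (assuming xmlto is installed; `doc.xml` is a placeholder DocBook file, not something from this thread):

```shell
# Thin wrapper around xmlto; the format names (pdf, ps, txt,
# html-nochunks, ...) are xmlto's own.
docbook_convert() {
    fmt=$1    # target format, e.g. pdf, ps, txt, html-nochunks
    src=$2    # DocBook XML input file
    xmlto "$fmt" "$src"
}

# Example calls (commented out so the sketch has no side effects):
# docbook_convert pdf doc.xml    # would write doc.pdf
# docbook_convert txt doc.xml    # would write doc.txt
```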

For the suggestion from slide, there is ps2pdf. It's in the gs-common
package on Debian.
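A minimal sketch of that step (assumes ghostscript's ps2pdf is on the PATH; `capture.ps` is a placeholder filename):

```shell
# Derive the output name from the input: foo.ps -> foo.pdf.
ps_to_pdf() {
    in=$1
    ps2pdf "$in" "${in%.ps}.pdf"
}

# ps_to_pdf capture.ps    # would write capture.pdf
```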

I'm pretty certain there's an extension that can save a page via some
sort of client-side cron job.

Maybe ScrapBook or another extension has a way to call it from the command
line.

https://addons.mozilla.org/firefox/427/
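On the scripting-from-the-command-line angle, one hedged sketch: Firefox of this era understood a -remote option for handing a URL to an already-running instance, though the actual saving would still have to come from an extension like ScrapBook. The URL below is a placeholder.

```shell
url="http://www.example.com/daily-report.php"   # placeholder URL
remote_cmd="openurl($url,new-tab)"

# Only attempt it if firefox is actually installed; a cron job could run
# this, then another script could poll the save directory for new files
# to mail out.
if command -v firefox >/dev/null 2>&1; then
    firefox -remote "$remote_cmd" || true
fi
```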

konqueror might have a way to save pages from the command line. Poke
around with dcop or kdcop :).
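A rough sketch of that poking around, with the caveat that the exact DCOP object and method names are guesses to be checked in kdcop, not a known-good recipe:

```shell
url="http://www.example.com/daily-report.php"   # placeholder URL

if command -v dcop >/dev/null 2>&1; then
    # Find a running konqueror instance's DCOP application id.
    app=$(dcop 2>/dev/null | grep -m1 '^konqueror' || true)
    # The object/method below are unverified guesses; browse them in kdcop:
    # [ -n "$app" ] && dcop "$app" konqueror-mainwindow#1 openURL "$url"
    echo "found: ${app:-none}"
fi
```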

ciao,

der.hans

>
>
> On Mon, 2006-08-28 at 14:51 -0700, der.hans wrote:
>> Am 28. Aug, 2006 schwätzte alex at crackpot.org so:
>>
>>> Quoting Shawn Badger <sbadger at cskauto.com>:
>>>
>>>> Does anyone know of a CLI app that can capture a web page to a jpg
>>>> or, better, a pdf? I need to capture a dynamic page on a daily basis
>>>> and e-mail the captured image to various people. I have tried using
>>>> wget, but it saves some weird results. I suspect that is because the
>>>> page I am polling is generated with PHP.
>>>>
>>>> Any ideas would be appreciated.
>>>
>>> wget downloads/saves the source text of the page, so if you want a jpg or
>>> pdf, I don't think that will help you.
>>>
>>> This app is web-based, not CLI, but it might be worth looking at :
>>> http://bluga.net/webthumb/index.php
>>
>> Josh (one of the people behind AzPHP) is here in town, so maybe you can
>> convince him to let you use it on your company's intranet.
>>
>> Also, there are some extensions for Firefox that will run scripts. Maybe
>> you could setup Firefox to pull up the page and save it to PDF. A script
>> run from cron could then poll for new files to mail out.
>>
>> PrefBar is the extension I can think of off the top of my head. Not
>> available from the Firefox website, but still worth looking into.
>>
>> Is there a command line option to Firefox to save a URL to PDF?
>>
>> ciao,
>>
>> der.hans
>> --
>> #  https://www.LuftHans.com/        http://www.CiscoLearning.org/
>> #  Join the League of Professional System Administrators! https://LOPSA.org/
>> #  C'est la Net - der.hans
>
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>

-- 
#  https://www.LuftHans.com/        http://www.CiscoLearning.org/
#  Join the League of Professional System Administrators! https://LOPSA.org/
#  <allbery_b> wouldn't that be "shopping is hard, let's do math"?

