Re: How to capture the text contents on a webpage "inside" f…

Author: Kurt Granroth
Date:  
To: joe, Main PLUG discussion list
Subject: Re: How to capture the text contents on a webpage "inside" frame?
On Jul 8, 2005, at 3:20 PM, Josef Lowder wrote:
> I have a web page open in Konqueror that has a large text file open within
> a "frame" window that is inside the host frame. (Not sure if I'm using the
> right terms, here.)
>
> I want to capture the contents of this file, but haven't been able to do so.
> I have plenty of memory and have captured other files of a similar size with
> no problem, but there is something about this one that I can't get.

[snip]
> I also tried view document source thinking that perhaps I could copy that
> and clean up all the superfluous html code, but that also doesn't work.


Actually, I'm surprised that method didn't work. In 99% of cases, all
you need to do is look in the page source for a tag like
<frame src='whatever.html'> and then fetch whatever.html directly. I
ran into one case a while back where I absolutely could not access the
contents of a frame directly no matter what I tried, but that's
extremely rare.
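
If the source is long, picking out the frame targets by eye gets
tedious. Here is a rough sketch of how that lookup could be scripted,
assuming you have saved the host page to a file, each frame tag sits on
one line, and the src value is quoted. The function name
list_frame_srcs is just something I made up for illustration:

```shell
# List the src targets of <frame> and <iframe> tags in a saved HTML file.
# Assumptions: each tag fits on a single line and its src value is quoted.
# $1 = path to the saved HTML file
list_frame_srcs() {
    # 1. pull out every <frame ...> or <iframe ...> tag (case-insensitive)
    # 2. keep only the src="..." or src='...' attribute
    # 3. strip the attribute name and the surrounding quotes
    grep -Eio "<i?frame[[:space:]][^>]*>" "$1" \
        | grep -Eio "src=[\"'][^\"']*[\"']" \
        | sed -E "s/^[^\"']*[\"']//; s/[\"']\$//"
}
```

Running list_frame_srcs on the saved page prints one target per line.
A relative target such as whatever.html still has to be resolved
against the address of the host page before you can fetch it.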

> Any suggestions? Perhaps there is some way to capture whatever text is
> suspended in memory?


If all else fails, you could try getting the file directly out of
Konqueror's cache. I don't know whether the cache lives in the same
place on every system. On my SuSE 9.3 installation, the cache is in
$HOME/.kde/cache-<HOST>/http.

Try something like:

find $HOME/.kde/cache-<HOST>/http -name "*THE_SITE_HOSTNAME*" -print0 |
    xargs -0 fgrep "SOME TEXT FROM THE FILE"

That should do it.
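
If you end up doing this more than once, the same search can be wrapped
in a small function. This is only a sketch, assuming GNU find, xargs
and grep. The name find_in_cache and its three arguments are
hypothetical, and the cache directory is whatever it turns out to be on
your system:

```shell
# Print the names of cached files whose file name contains a host string
# and whose contents contain a fixed piece of text.
# Assumes GNU find, xargs and grep (-print0, -0 and -r are GNU options).
# $1 = cache directory, $2 = hostname fragment, $3 = text to look for
find_in_cache() {
    # -print0 / -0 keep file names with spaces intact;
    # -r skips grep entirely when find matches nothing;
    # grep -F treats the text as a fixed string (same as fgrep),
    # and -l prints only the names of the matching files.
    find "$1" -type f -name "*$2*" -print0 \
        | xargs -0 -r grep -lF -- "$3"
}
```

You would call it as
find_in_cache CACHE_DIR THE_SITE_HOSTNAME "SOME TEXT FROM THE FILE"
and then open whichever file it names.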

Kurt
---------------------------------------------------
PLUG-discuss mailing list -
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss