FW: [Tutor] Writing a web bot.

Author: Rod Roark
Date:  
Subject: FW: [Tutor] Writing a web bot.
Perhaps most servers do. However, notice the capitalized word MAY in
your references. This means that the behavior is not a requirement of
the specification. Even the SHOULD behavior is not a requirement for
"conditional compliance" with the 1.1 spec.

I suppose the bot app could hope for a persistent connection, provided
that it has a fallback mechanism for when the connection is dropped.
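
Something along these lines might do as a rough sketch. It assumes
Python 3's http.client module (httplib in older Pythons); the class name
PersistentFetcher and the host in the usage comment are made up for
illustration. It keeps one connection open across requests and quietly
reconnects once if the server has closed it.

```python
import http.client


class PersistentFetcher:
    """Reuse one HTTP/1.1 connection; reconnect if the server drops it."""

    def __init__(self, host, port=80, timeout=10):
        self.host, self.port, self.timeout = host, port, timeout
        self.conn = None
        self.connects = 0  # number of TCP connections opened so far

    def _connect(self):
        self.conn = http.client.HTTPConnection(
            self.host, self.port, timeout=self.timeout)
        self.conn.connect()
        self.connects += 1

    def close(self):
        if self.conn is not None:
            self.conn.close()
            self.conn = None

    def request(self, method, path, body=None, headers=None):
        """Send one request and return (status, body_bytes)."""
        for attempt in (1, 2):
            if self.conn is None:
                self._connect()
            try:
                self.conn.request(method, path, body=body,
                                  headers=headers or {})
                resp = self.conn.getresponse()
                data = resp.read()  # drain fully before reusing the socket
            except (http.client.HTTPException, ConnectionError):
                # Fallback: the server dropped the connection, so retry
                # once on a fresh one. Caution: this can repeat a POST.
                self.close()
                if attempt == 2:
                    raise
                continue
            if resp.will_close:
                # Server signalled (or implied, as in HTTP/1.0) that it
                # will close, so open a new connection next time.
                self.close()
            return resp.status, data


# Usage (hypothetical host and paths):
#   bot = PersistentFetcher("www.example.com")
#   status, page = bot.request("GET", "/")
#   status, page = bot.request("POST", "/login", body=b"user=x",
#       headers={"Content-Type": "application/x-www-form-urlencoded"})
#   bot.close()
```

Whether the connection actually stays open is still up to the server;
the connects counter only shows how often the bot had to reconnect.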

Regards,

-- Rod

On Fri, 07 Jul 2000, you wrote:
> Well, that would be nice, except almost all servers are HTTP 1.1 compliant.
> And HTTP 1.1 states that connections should be left open for additional
> requests unless otherwise specified.
>
> Trust me, I learned this the hard way. Most servers will assume that the
> client wishes to send multiple requests unless the client specifies
> otherwise.
>
> From RFC 2616:
>
> 8.1.2 Overall Operation
>
>    A significant difference between HTTP/1.1 and earlier versions of
>    HTTP is that persistent connections are the default behavior of any
>    HTTP connection. That is, unless otherwise indicated, the client
>    SHOULD assume that the server will maintain a persistent connection,
>    even after error responses from the server.
>
> ...
>
> 8.1.2.1 Negotiation
>
>    An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to
>    maintain a persistent connection unless a Connection header including
>    the connection-token "close" was sent in the request. If the server
>    chooses to close the connection immediately after sending the
>    response, it SHOULD send a Connection header including the
>    connection-token close.
>
>    An HTTP/1.1 client MAY expect a connection to remain open, but would
>    decide to keep it open based on whether the response from a server
>    contains a Connection header with the connection-token close. In case
>    the client does not want to maintain a connection for more than that
>    request, it SHOULD send a Connection header including the
>    connection-token close.
>
> ...
>
> 8.2.1 Persistent Connections and Flow Control
>
>    HTTP/1.1 servers SHOULD maintain persistent connections and use TCP's
>    flow control mechanisms to resolve temporary overloads, rather than
>    terminating connections with the expectation that clients will retry.
>    The latter technique can exacerbate network congestion.
>
> Michael J. Sheldon
> Internet Applications Developer
> Phone: 480.699.1084
> http://www.desertraven.com/
> PGP Key Available on Request
>
> -----Original Message-----
> From:
> [mailto:plug-discuss-admin@lists.PLUG.phoenix.az.us]On Behalf Of Rod
> Roark
> Sent: Friday, July 07, 2000 20:02
> To:
> Subject: RE: FW: [Tutor] Writing a web bot.
>
>
> From the HTTP 1.0 specification: "Current practice requires that the
> connection be established by the client prior to each request and
> closed by the server after sending the response."
>
> Certainly cooperating clients and servers can behave otherwise, but the
> application in question is a bot, and no such cooperation can be
> expected.
>
> -- Rod
>
> On Fri, 07 Jul 2000, Mike Sheldon wrote:
> > Actually, HTTP does work that way. You can retrieve multiple files
> > through a single connection.
> >
> > Michael J. Sheldon
> > Internet Applications Developer
> > Phone: 480.699.1084
> > http://www.desertraven.com/
> > PGP Key Available on Request
> >
> > -----Original Message-----
> > From:
> > [mailto:plug-discuss-admin@lists.PLUG.phoenix.az.us]On Behalf Of Rod
> > Roark
> > Sent: Friday, July 07, 2000 18:13
> > To:
> > Subject: Re: FW: [Tutor] Writing a web bot.
> >
> >
> > HTTP doesn't work that way. The server is going to kill the connection
> > after responding to each request.
> >
> > -- Rod
> > ----------------------------------------------------------------------
> > Sunset Systems                           Preconfigured Linux Computers
> > http://www.sunsetsystems.com/                      and Custom Software
> > ----------------------------------------------------------------------
> >
> > On Fri, 07 Jul 2000, you wrote:
> > > Hi all.
> > >
> > > It appears I have found myself in a position
> > > where I could use some help.
> > >
> > > The task I am trying to perform is write an
> > > internet bot. I was going to use urllib for
> > > this project however one of the requirements
> > > is for the connection to be continuous during
> > > the session.
> > >
> > > Connect to a site.
> > > Get page, parse.
> > > Get another page, parse.
> > > use POST method, get another page, parse.
> > > Disconnect from the site.
> > >
> > > The connection is not supposed to be dropped
> > > between the requests.
> > >
> > > Is there a simple way to do this task???
> > >
> > > thanks.