On Fri, Dec 13, 2002 at 06:03:38PM +0800, Bernard Blackham wrote:
> On Thu, Dec 12, 2002 at 10:08:23AM -0500, Marco Antonio wrote:
> > Now we are facing a problem: some people are making 'automated
> > searches' on our www server - an ugly IIS5 :) - and we intend to
> > block this kind of search.
>
> Check your logs and see what the User-Agent header is for those
> requests - you may be lucky and have a handful like "WebSpider" or
> "Googlebot" or similar.
>
> If this is the case, you can drop squid (or another proxy if you
> prefer) on your firewall, set it up to transparently proxy for the
> web server, and tell squid to deny requests with those User-Agent
> headers.

If you're lucky! ;) If that is the case, putting up a proxy server is
overkill. Every reasonable bot checks for a robots.txt file in the
server's root directory. By creating one and asking the bots not to
catalogue your website, you should be able to keep them from doing
this - something like the sketch below.
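A minimal robots.txt, assuming you want to turn away all compliant
robots from the whole site (the file just has to be reachable as
/robots.txt on the web server):

    # /robots.txt - sketch: ask all compliant robots to stay out
    User-agent: *
    Disallow: /

If you only want to shut out the specific bots you see in the logs,
repeat the block with their names (e.g. "User-agent: WebSpider")
instead of the "*" wildcard. Keep in mind that robots.txt is purely
advisory - an impolite bot is free to ignore it.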
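And if some bots do ignore it, Bernard's squid approach can be quite
small. A sketch, untested - "WebSpider" and "Googlebot" are just the
example names from above, the acl name is made up, and the
transparent-proxy part of squid.conf is left out:

    # squid.conf fragment (sketch): deny requests by User-Agent.
    # "browser" acls match the User-Agent header against the given
    # regular expressions; -i makes the match case-insensitive.
    acl badbots browser -i webspider googlebot
    http_access deny badbots
    # ... your usual http_access rules follow here

The nice thing about matching the header at the proxy is that it
works even against bots that never ask for robots.txt.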
Ciao, Arne.

-- 
 ,``o.      OpenBSD - Debian GNU/Linux - Solaris
>o) >( ,c@  GPG 1024D/913C2F81 2000-10-11 Arne P. Boettger <apb@createx.de>
 /\\ ',,,'  Fingerprint = 6ED9 9A64 CD8A EB6F D841 0391 2F08 8F86 913C 2F81
 _\_V