On Fri, Dec 13, 2002 at 06:03:38PM +0800, Bernard Blackham wrote:
> On Thu, Dec 12, 2002 at 10:08:23AM -0500, Marco Antonio wrote:
> > Now we are facing a problem: some people are making 'automated
> > searches' on our www server - an ugly IIS5 :) - and we intend to
> > block this kind of search.
>
> Check your logs and see what the User-Agent header is for those
> requests - you may be lucky and have a handful like "WebSpider" or
> "Googlebot" or similar.
>
> If this is the case, you can drop squid (or another proxy if you
> prefer) on your firewall, set it up to transparently proxy for the
> web server, and tell squid to deny requests with those User-Agent
> headers.

If you're lucky! ;) If that is the case, putting up a proxy server is
overkill. Every reasonable bot checks for a robots.txt file in the
server's root directory. By creating one and asking the bots not to
catalogue your website, you should be able to keep them from doing
this - something like the sketch below.
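A minimal robots.txt, assuming you want to turn away all compliant
robots from the whole site (the file just has to be reachable as
/robots.txt on the web server):

    # /robots.txt - sketch: ask all compliant robots to stay out
    User-agent: *
    Disallow: /

If you only want to shut out the specific bots you see in the logs,
repeat the block with their names (e.g. "User-agent: WebSpider")
instead of the "*" wildcard. Keep in mind that robots.txt is purely
advisory - an impolite bot is free to ignore it.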
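And if some bots do ignore it, Bernard's squid approach can be quite
small. A sketch, untested - "WebSpider" and "Googlebot" are just the
example names from above, the acl name is made up, and the
transparent-proxy part of squid.conf is left out:

    # squid.conf fragment (sketch): deny requests by User-Agent.
    # "browser" acls match the User-Agent header against the given
    # regular expressions; -i makes the match case-insensitive.
    acl badbots browser -i webspider googlebot
    http_access deny badbots
    # ... your usual http_access rules follow here

The nice thing about matching the header at the proxy is that it
works even against bots that never ask for robots.txt.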
Ciao, Arne.

-- 
 ,``o.      OpenBSD - Debian GNU/Linux - Solaris
>o) >( ,c@  GPG 1024D/913C2F81 2000-10-11 Arne P. Boettger <apb@createx.de>
 /\\ ',,,'  Fingerprint = 6ED9 9A64 CD8A EB6F D841 0391 2F08 8F86 913C 2F81
 _\_V