How does Yahoo/Google find unknown domains?

Lyle Giese lyle at lcrcomputer.net
Mon Nov 8 01:44:26 UTC 2010


Michelle Konzack wrote:
> Hello experts and *,
>
> Over several years I have collected some domain names that had not
> existed for years, and I registered them in the last 4 months for the
> internal use of my Internet service.
>
> Now I see Googlebot, Yahoo and <he.net> querying my DNS servers for
> exactly those domains.
>
> If I read the conditions of Network Solutions and Co. correctly,
> spidering WHOIS records is prohibited, as is commercial use of the data.
>
> Does anyone have experience with this crap?
>
> Unfortunately I cannot deny access to the 180 servers, and Google, Yahoo
> and HE are bombarding my network with too many useless requests. I have
> written a mail to Google asking them not to attack my network of VoIP
> and IPTV servers, but they continue...
>
> The webservers have only an SHTTP administrative VHost and no VHost for
> <exp.com> or <www.exp.com>, but the webserver still gets all requests
> for <*.exp.com> because it is an administrative VServer, and the daily
> error logfile is VERY long.
>
> An htaccess does not work, because I have more than 800 VHosts on each
> server.
>
> Thanks, Greetings and nice Day/Evening
>     Michelle Konzack
>
>   
Somewhere, someone tries to access that domain name for some reason, their
DNS servers make a note of that, and they harvest that info (just a
wild a** guess).  On the other hand, I have seen cases where somebody at NS
gave out a copy of their WHOIS data for 'research' purposes.  Technically,
the web interface to the WHOIS data is what that restriction refers
to.  It does not necessarily stop someone from asking for, or paying for,
access to that data by other means.

Again, I have no inside knowledge nor do I claim any special knowledge
or access in this area.

Yahoo's Slurp is a misbehaved robot (IMHO), but it does honor
robots.txt.  I also put in an index.html that redirects accidental
visitors to my commercial business homepage.
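
For what it's worth, here is a rough sketch of how a robots.txt that
shuts out Slurp could be checked before deploying it.  It uses Python's
standard urllib.robotparser.  The robots.txt content, the is_allowed
helper and the exp.com URL are only assumptions for illustration, not
what I actually run.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Yahoo's Slurp everywhere, allow all others.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # exp.com is just the placeholder domain from the original question.
    for agent in ("Slurp", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://exp.com/index.html"))
```

Note that this only tells you what a well-behaved robot should do with
the file.  It does nothing about the DNS queries themselves, and a robot
that ignores robots.txt will not care.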

Lyle Giese
LCR Computer Services, Inc.




More information about the bind-users mailing list