SERVFAIL for some domains on some servers

Oliver Henriot Oliver.Henriot at imag.fr
Tue Mar 2 20:12:23 UTC 2010


Dear Sten,

In his great wisdom, Sten Carlsen wrote, on 02/03/2010 18:43:
> Hi
>
> Seen from here, I get a fast response to dig "labanquepostale.fr".
>    
When you query your own servers?
I get a fast response from isis.imag.fr and imag.imag.fr; only
brahma.imag.fr and cosmos.imag.fr time out.

> Querying your servers 1 - 4 shows only that they do not answer recursive
> queries; that is different from my first try, but a good change.
>    

They only allow recursive queries for clients on our networks. We don't 
have open recursive servers so that's normal.

> For good resolution a nameserver must have access via:
> - UDP from any local port to remote port 53, with the possibility
>   of using EDNS0 (packet sizes over 512 bytes);
> - TCP from any local port to remote port 53.
>    

OK, thank you for those important technical details. It might indeed
be a network problem on our side.
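
The three paths listed above (plain UDP, UDP with EDNS0, and TCP) can be
probed separately. What follows is only a minimal sketch, assuming Python 3
and nothing beyond the standard library; the function names (build_query,
probe_udp, probe_tcp, check_paths) are made up for illustration and do not
come from BIND or any existing tool. It sends one hand-built query per path
to port 53 of a server and records whether any reply came back.

```python
import socket
import struct


def build_query(name, qtype=1, edns_size=None, query_id=0x1234):
    """Build a DNS query in wire format; append an EDNS0 OPT record if edns_size is set."""
    arcount = 1 if edns_size else 0
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    opt = b""
    if edns_size:
        # OPT pseudo-record: root name, TYPE=41, CLASS=advertised UDP size, TTL=0, RDLEN=0
        opt = b"\x00" + struct.pack("!HHIH", 41, edns_size, 0, 0)
    return header + question + opt


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def probe_udp(server, name, edns_size=None, timeout=5.0):
    """Send one query over UDP and return the raw reply."""
    query = build_query(name, edns_size=edns_size)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(query, (server, 53))
        data, _ = s.recvfrom(65535)
    return data


def probe_tcp(server, name, timeout=5.0):
    """Send one query over TCP (2-byte length prefix) and return the raw reply."""
    query = build_query(name)
    with socket.create_connection((server, 53), timeout=timeout) as s:
        s.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", recv_exact(s, 2))[0]
        return recv_exact(s, length)


def check_paths(server, name):
    """Try each path in turn; report 'ok' or the name of the exception raised."""
    probes = {
        "udp": lambda: probe_udp(server, name),
        "udp+edns0": lambda: probe_udp(server, name, edns_size=4096),
        "tcp": lambda: probe_tcp(server, name),
    }
    results = {}
    for label, probe in probes.items():
        try:
            probe()
            results[label] = "ok"
        except OSError as exc:  # timeouts and connection errors are OSError subclasses
            results[label] = type(exc).__name__
    return results
```

Run on the resolver that fails, one would call check_paths() with the IP
address of one of the domain's name servers and a name in that domain. An
"ok" only means that some reply arrived: a small answer fits in 512 bytes
anyway, so the EDNS0 probe does not by itself prove that large replies
survive the path.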

> Being able to telnet only shows TCP connectivity and only from the port
> actually used when trying. UDP is the normal mode of communication.
>    

True, telnet to port 53 is a necessary but not a sufficient condition to 
guarantee there is no network problem involved.

> A firewall could block any of the above access routes: if TCP is open,
> you can telnet, but it might still block UDP.
>
> Badly designed firewalls have also been seen to modify or truncate
> packets in transit, so there are many ways a firewall can still
> interfere.
>
>    

OK, thank you for all these pointers. I'll investigate our network
setup to make sure everything is OK.
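
When checking for replies that get modified or truncated on the way, the
12-byte DNS header already tells a lot: the TC bit says whether the server
itself truncated the answer, and the RCODE says how the query ended. The
following is a minimal sketch, assuming Python 3 and the standard library
only; parse_header is an illustrative name, not an existing API. The sample
header at the end is rebuilt by hand from the SERVFAIL reply quoted further
down (id 37397, flags qr rd ra, one question); it is not a captured packet.

```python
import struct

# Standard response codes from the low 4 bits of the flags word
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def parse_header(message):
    """Decode the fixed 12-byte header of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    rcode = flags & 0x000F
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),  # message is a response
        "tc": bool(flags & 0x0200),  # truncated: retry over TCP
        "rd": bool(flags & 0x0100),  # recursion desired
        "ra": bool(flags & 0x0080),  # recursion available
        "rcode": RCODES.get(rcode, str(rcode)),
        "counts": (qd, an, ns, ar),  # QUERY, ANSWER, AUTHORITY, ADDITIONAL
    }


# Hand-built header: qr + rd + ra + rcode 2 gives a flags word of 0x8182
sample = struct.pack("!HHHHHH", 37397, 0x8182, 1, 0, 0, 0)
header = parse_header(sample)
```

A reply that arrives with TC clear but with fewer bytes than its section
counts require would point at something on the path cutting the packet,
rather than at the server.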

Best regards,

Oliver

>
> Oliver Henriot wrote:
>    
>> Dear Sten,
>>
>> I didn't give the domain I'm encountering problems with because it
>> seemed irrelevant to me.
>>
>> As Stéphane Bortzmeyer says in his message of 01/03/10 11:44, it's
>> best to give names, so here goes:
>> x.fr is labanquepostale.fr
>> "1" is imag.imag.fr
>> "2" is brahma.imag.fr
>> "3" is isis.imag.fr
>> "4" is cosmos.imag.fr
>>
>> As to a possible firewall problem, how could this be if the servers
>> encountering problems don't have any access problems on TCP port 53?
>>
>> Thanks.
>>
>> Oliver
>>
>> In his great wisdom, Sten Carlsen wrote, on 27/02/10 19:06:
>>      
>>> Since you don't say which domain is the problem, and I get perfect
>>> answers for imag.fr (my only possible guess) from all listed
>>> servers, I have no clue.
>>>
>>> Best guess is still some firewall doing something stupid.
>>>
>>>
>>> Oliver Henriot wrote:
>>>        
>>>> Dear list users,
>>>>
>>>> Maybe you can help me out here. Please bear with me if I'm stating the
>>>> obvious, but my computing skills are limited and I still have a lot to
>>>> learn.
>>>>
>>>> I have a series of name servers, some of which fail to resolve hosts
>>>> in other domains whereas others don't have any problem.
>>>>
>>>> My setup is as follows:
>>>> - server "1" : master for my domain, recursion disabled for all except
>>>> localhost. Setup is BIND 9.5.1-P2 on SunOS 5.9.
>>>> - servers "2", "3" and "4" : slaves for my domain, recursion allowed
>>>> for all, official resolvers for my clients, same configuration on all
>>>> 3. Setup is BIND 9.3.6-P1 on CentOS 5.4.
>>>>
>>>> Servers "2" and "4" fail to resolve domain x.fr whereas "1" and "3"
>>>> have no problem (when queried locally in the case of "1", of course).
>>>> The error I get is:
>>>>
>>>>
>>>> dig -t A @"2" www.x.fr
>>>>
>>>> ; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> -t A @"2" www.x.fr
>>>> ; (1 server found)
>>>> ;; global options:  printcmd
>>>> ;; Got answer:
>>>> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 37397
>>>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
>>>>
>>>> ;; QUESTION SECTION:
>>>> ;www.x.fr.                IN      A
>>>>
>>>> ;; Query time: 4622 msec
>>>> ;; SERVER: "2"#53("2")
>>>> ;; WHEN: Sat Feb 27 18:20:07 2010
>>>> ;; MSG SIZE  rcvd: 40
>>>>
>>>>
>>>> The behavior is the same for "4" and for any host in domain x.fr (and
>>>> the domain itself).
>>>>
>>>> It's not a network problem: I can telnet to port 53 of the name
>>>> servers for domain x.fr from "2" (obviously using the IP address, as
>>>> the name can't be resolved by the server).
>>>>
>>>> Also, reverse queries for hosts in domain x.fr from "2" do not fail.
>>>>
>>>> Finally, even stranger, if I use dig's +trace option, servers "2"
>>>> and "4" no longer fail and can resolve www.x.fr (although the
>>>> query lags quite a bit on the last step of the resolution, from x.fr
>>>> to www.x.fr).
>>>>
>>>> Here's the output:
>>>>
>>>> dig www.x.fr @"2" +trace
>>>>
>>>> ; <<>> DiG 9.5.1-P3 <<>> www.x.fr @"2" +trace
>>>> ;; global options:  printcmd
>>>> .                       518400  IN      NS      F.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      G.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      H.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      I.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      J.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      K.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      L.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      M.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      A.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      B.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      C.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      D.ROOT-SERVERS.NET.
>>>> .                       518400  IN      NS      E.ROOT-SERVERS.NET.
>>>> ;; Received 500 bytes from "2"#53("2") in 2 ms
>>>>
>>>> fr.                     172800  IN      NS      E.EXT.NIC.fr.
>>>> fr.                     172800  IN      NS      B.EXT.NIC.fr.
>>>> fr.                     172800  IN      NS      F.EXT.NIC.fr.
>>>> fr.                     172800  IN      NS      A.NIC.fr.
>>>> fr.                     172800  IN      NS      C.NIC.fr.
>>>> fr.                     172800  IN      NS      G.EXT.NIC.fr.
>>>> fr.                     172800  IN      NS      D.NIC.fr.
>>>> fr.                     172800  IN      NS      D.EXT.NIC.fr.
>>>> ;; Received 444 bytes from 192.58.128.30#53(J.ROOT-SERVERS.NET) in 44 ms
>>>>
>>>> x.fr.     172800  IN      NS      ns1.x.fr.
>>>> x.fr.     172800  IN      NS      ns2.x.fr.
>>>> ;; Received 108 bytes from 193.176.144.6#53(E.EXT.NIC.fr) in 33 ms
>>>>
>>>> www.x.fr. 300     IN      A       xxx.xxx.xxx.xxx
>>>> x.fr.     300     IN      NS      ns2.x.fr.
>>>> x.fr.     300     IN      NS      ns1.x.fr.
>>>> ;; Received 124 bytes from xxx.xxx.xxx.xxx#53(ns1.x.fr) in 0 ms
>>>>
>>>>
>>>> I'm at a loss as to what's going on (or wrong) here and what I can
>>>> do to solve the problem. Any help would be greatly appreciated.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Oliver
>>>>
>>>>
>>>> _______________________________________________
>>>> bind-users mailing list
>>>> bind-users at lists.isc.org
>>>> https://lists.isc.org/mailman/listinfo/bind-users



