Bind 9.4.2 not resolving one domain

caio elcaio at gmail.com
Thu Sep 4 19:23:11 UTC 2008


Chris Buxton wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On Sep 4, 2008, at 10:58 AM, caio wrote:
>> Chris Buxton wrote:
>>> I would be more inclined to suspect network connectivity problems with
>>> the lookup you're having problems with. With that many lookups, each one
>>> needs to complete in a reasonable amount of time - 50 ms on average, or
>>> thereabouts, to complete the whole thing in 5 seconds. How is your
>>> connection to the various servers involved?
>>
>> I do not know whether it is a connectivity problem, because I have 2 name
>> servers at the same level of the network hierarchy (but on different
>> subnets), and it may be that one is working fine while the other is failing.
>>
>> Here is the case of the secondary NS (at this moment):
>>
>> # dig @dns2.mydomain.com www.yahoo.com.ar +trace
> 
> [...]
> 
>> And without the "+trace" argument:
>>
>> # dig @dns2.mydomain.com www.yahoo.com.ar
>>
>> ; <<>> DiG 9.4.2 <<>> @dns2.mydomain.com www.yahoo.com.ar
>> ; (1 server found)
>> ;; global options:  printcmd
>> ;; connection timed out; no servers could be reached
>>
>> Why does the query seem to finish with '+trace', but fail without it?
> 
> 
> The "+trace" option causes dig to behave quite differently than without. 
> With "+trace", you're not really asking your server anything other than 
> for a list of root servers. Then 'dig' does all the work of recursion.
> 
> More interesting would be to repeat your previous query with "+norec" 
> added, in parallel with the recursive query. Or better yet, configure 
> logging so that we can see what's going on - but this can be hard with a 
> busy server.
> 
> The fact that you previously indicated that retrying the query a few 
> seconds later yields an answer tells me that this is some kind of 
> performance problem, most likely in network latency (as Kevin Darcy 
> originally suggested). Looking at the trace, which doesn't show 
> everything (and also terminates at the first CNAME record), I can see 
> some pretty slow response times - the response from the root server is 
> over 400 ms. Of course, your resolving name server most likely has some 
> of this already in cache, including good working RTT values for the root 
> and .com servers, among others. Therefore, it's likely that your server 
> is completing the recursion process in something like 6 seconds, just a 
> bit over dig's 5 second timeout. Try this:
> 
> dig @dns2.mydomain.com www.yahoo.com.ar +time=20
> 
> What is the result? You might do something like this for a real test:
> 
> rndc flush
> # wait 10 seconds
> dig @dns2.mydomain.com www.yahoo.com.ar +time=20
> 
> Chris Buxton

OK, I understand your explanation of dig +trace. Thanks, Chris.

Here is the result of:

# rndc flush
# (10 secs)
# dig @dns2.mydomain.com www.yahoo.com.ar +time=20

; <<>> DiG 9.4.2 <<>> @dns2.mydomain.com www.yahoo.com.ar +time=20
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 36748
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.yahoo.com.ar.              IN      A

;; Query time: 19738 msec
;; SERVER: <mydomain_public_ip_addr>#53(<ip_addr>)
;; WHEN: Thu Sep  4 16:06:49 2008
;; MSG SIZE  rcvd: 34

Two minutes later I ran two digs in parallel (one with +norec, one
recursive). I do not know how to explain it, but both returned successful
results, though with different query times:

# dig @dns2.mydomain.com www.yahoo.com.ar +norec

; <<>> DiG 9.4.2 <<>> @dns2.mydomain.com www.yahoo.com.ar +norec
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6147
;; flags: qr ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;www.yahoo.com.ar.              IN      A

;; ANSWER SECTION:
www.yahoo.com.ar.       1565    IN      CNAME   hp2.latam.g1.b.yahoo.com.
hp2.latam.g1.b.yahoo.com. 54    IN      CNAME   us.hp2.latam.a1.b.yahoo.com.
us.hp2.latam.a1.b.yahoo.com. 299 IN     A       68.142.226.230

;; AUTHORITY SECTION:
a1.b.yahoo.com.         172554  IN      NS      yf1.yahoo.com.
a1.b.yahoo.com.         172554  IN      NS      yf2.yahoo.com.

;; Query time: 0 msec
;; SERVER: <mydomain_public_ip_addr>#53(<ip_addr>)
;; WHEN: Thu Sep  4 16:10:25 2008
;; MSG SIZE  rcvd: 154

The recursive query returned the same answer, but with:

;; Query time: 174 msec
;; WHEN: Thu Sep  4 16:10:24 2008
;; MSG SIZE  rcvd: 154


What happened here is exactly what I wanted to show you: the randomness of
this qname's resolution on my name servers.

If things go as they 'normally' do, the primary name server will start to
fail after a while, and the secondary name server will keep resolving fine.
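
To pin down this kind of intermittent behaviour, it may help to log the
status and query time of the same lookup at regular intervals on both
servers. The following is only a rough sketch: summarize_dig is a
hypothetical helper (not part of BIND) that reduces dig's output to one
line, and the 30-second interval and +tries=1 in the commented usage loop
are arbitrary choices.

```shell
#!/bin/sh
# summarize_dig: hypothetical helper, not part of BIND.
# Reads dig output on stdin and prints "status=<rcode> time_ms=<n>".
summarize_dig() {
    awk '
        # dig prints this when no reply arrives within the timeout
        /connection timed out/ { status = "TIMEOUT" }
        # header line, e.g.: ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 36748
        /->>HEADER<<-/ {
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        # timing line, e.g.: ;; Query time: 19738 msec
        /^;; Query time:/ { ms = $4 }
        END {
            if (status == "") status = "UNKNOWN"
            if (ms == "") ms = "-"
            printf "status=%s time_ms=%s\n", status, ms
        }
    '
}

# Usage (server and name taken from this thread; interval is an assumption):
# while :; do
#     printf '%s ' "$(date '+%H:%M:%S')"
#     dig @dns2.mydomain.com www.yahoo.com.ar +time=20 +tries=1 | summarize_dig
#     sleep 30
# done
```

Running the same loop against each name server and comparing the two logs
should show when each one starts returning SERVFAIL or timing out.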

--
caio

