BIND responding slow or not at all....

Kevin Darcy kcd at chrysler.com
Tue Jul 15 03:30:21 UTC 2008


Johan Louwers wrote:
> Hi,
> I currently have a problem with my two BIND servers on Gentoo.
> Pubmai01 is BIND 9.3.2_p1
> Pubmai02 is BIND 9.3.3
>
>
> When I do a local dig for a local address I get the response immediately
> pubmai02# dig someserver.mydomain.com
>
> Using the IP address of the name server I am logged in to also
> works, and I get the answer immediately
> pubmai02# dig @10.32.5.2 someserver.mydomain.com
>
> Using only dig gives, after some time
> pubmai02# dig
> ; <<>> Dig 9.2.3 <<>>
> ;;  global options:   printcmd
> ;;  connection timed out; no servers could be reached
> pubmai02#
>
> When I query for an external domain, let's say google.com, most of the
> time I get the same result as when I execute only dig. Sometimes,
> however, I do get a result; it is quite random. Every time I make a
> request this way, whether I get a result or not, I do
> get an entry in the query log.
> pubmai02# dig @10.32.5.2 google.com
>
> I can do all those things on the local machine, and I can also do it
> from the other name server: same result, and always an entry in the
> query log. However, when I try it from another network segment, for
> example from a machine with IP 10.32.2.19, I do not get a result and I
> do not see an entry in the query log. I can, however, access the machine
> by telnet/ssh/ftp/ping, etc.
>
> The strange thing is that these problems started suddenly on Saturday
> morning 7.07, and we have not made any changes to the
> network infrastructure or the BIND servers as far as we know. We
> did, however, notice that emerge is scheduled in crontab by a former
> system engineer, which we had not noticed until now.
> Can anyone give us a clue as to what might be causing this and how
> we can solve the problems?
>   
Your troubleshooting steps so far are rather disjointed.

-- Command-line utilities such as ssh/ftp/telnet/ping may or may not use 
DNS to resolve their name queries, or they may use other sources of 
naming information (e.g. /etc/hosts, NIS, etc.) *before* consulting DNS. 
On Gentoo, I believe this is controlled (as it is also on Solaris) by 
the nsswitch.conf file. In the case where a client resolves a name and 
the query doesn't show up in your query logs, see if it got the 
resolution through some other means besides DNS. If that checks out, 
then check out the resolvers listed as "nameserver"s in the client's 
/etc/resolv.conf. Is 10.32.5.2 first in the list? If not, then maybe 
some other resolver gave the answer before 10.32.5.2 was even queried. 
You might be looking at the query logs on the wrong machine.
-- In the absence of a "@" on the command line, "dig" will, likewise, 
use /etc/resolv.conf to find resolvers to use, so, again, see if there 
is something unexpected in the contents of that file.
-- Performing a "dig" with "@10.32.5.2" directs a query for that name 
directly to 10.32.5.2, so /etc/resolv.conf is irrelevant in that case.
-- Note that "dig" without any parameters generates a query for the NS 
records of the root zone ("."), which is somewhat special. How is named 
configured, such that it can resolve root-zone queries in your 
environment? Hints? Forwarding? Slave? I'm not sure that querying the 
root zone is really a recommended way to troubleshoot resolution in 
general, since the root zone has some special characteristics which do 
not apply to other zones.
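
To make the nsswitch.conf and resolv.conf checks above concrete, here is 
a minimal shell sketch. The function names (list_nameservers, 
hosts_lookup_order, is_first_nameserver) are made up for illustration, 
and the default paths /etc/resolv.conf and /etc/nsswitch.conf are an 
assumption about the client; pass another path if your system differs.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a client's resolver configuration.

# Print the "nameserver" addresses of a resolv.conf-style file, in order.
# The path defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Print the sources on the "hosts:" line of an nsswitch.conf-style file,
# in the order they are consulted (e.g. "files dns").
hosts_lookup_order() {
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print }' \
        "${1:-/etc/nsswitch.conf}"
}

# Succeed only if address $1 is the first nameserver listed in file $2.
is_first_nameserver() {
    [ "$(list_nameservers "$2" | head -n 1)" = "$1" ]
}
```

Run on the client that gets no answer, "is_first_nameserver 10.32.5.2 && 
echo yes" would show whether 10.32.5.2 is the first resolver that client 
tries, and hosts_lookup_order shows whether "files" or some other source 
is consulted before "dns".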

If you _can_ narrow this down to intermittent timeouts for a specific 
resolver trying to resolve specific names, then what I'd do next is 
mimic the action of the iterative resolver from the same box (using the 
same source address(es)/port(s) if you've locked those down), working 
down from the root zone through each level of the delegation hierarchy 
and following referrals. Since you say this is intermittent, I'd do it 
several times and see whether I get any timeouts along the way. If 
you're not familiar with the nuances of iterative resolution, the 
"+trace" option of dig provides a reasonable facsimile.
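
As a rough sketch of the "do it several times" idea, the loop below runs 
"dig +trace" repeatedly and counts the runs that fail. The function name 
trace_repeat and the DIG variable are invented for this example, and 
treating a non-zero exit status or the text "timed out" in the output as 
a failure is an assumption about how your version of dig reports 
timeouts, so check it against what you actually see.

```shell
#!/bin/sh
# Command used to run dig; override DIG to point at a specific binary.
DIG="${DIG:-dig}"

# trace_repeat NAME [RUNS]
# Run "dig +trace NAME" RUNS times (default 5) and report how many failed.
trace_repeat() {
    name=$1
    runs=${2:-5}
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Capture the output so a failed trace can be inspected afterwards.
        out=$($DIG +trace "$name" 2>&1)
        if [ $? -ne 0 ] || printf '%s\n' "$out" | grep -q 'timed out'; then
            failures=$((failures + 1))
            printf 'run %s: FAILED\n%s\n' "$i" "$out"
        else
            printf 'run %s: ok\n' "$i"
        fi
        i=$((i + 1))
    done
    printf '%s of %s runs failed\n' "$failures" "$runs"
}
```

For instance, "trace_repeat google.com 10" run on pubmai02 prints the 
trace output of any failed run, which shows the level of the delegation 
hierarchy at which the timeout occurred. If named is locked to a 
particular source address, dig's "-b" option can be added to the command 
so that the test queries leave from the same address.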

Not being a Gentoo user, I don't know what "emerge" is, but I assume 
it's some sort of auto-update function (?). If it's as much of a 
bandwidth hog as most such utilities, it might have been saturating 
your Internet link. Since DNS lookups normally use UDP, which does not 
retransmit lost packets itself, they tend to be disproportionately 
affected by link saturation.

                                                                         
                                       - Kevin


