BIND responding slow or not at all....
Kevin Darcy
kcd at chrysler.com
Tue Jul 15 03:30:21 UTC 2008
Johan Louwers wrote:
> Hi,
> I currently have a problem with my 2 bind servers on gentoo.
> Pubmai01 is BIND 9.3.2_p1
> Pubmai02 is BIND 9.3.3
>
>
> When I do a local dig to a local address I get the response directly
> pubmai02# dig someserver.mydomain.com
>
> using the IP address of the name server I am logged in to also
> works, and I get the answer directly
> pubmai02# dig @10.32.5.2 someserver.mydomain.com
>
> using only dig gives after some time
> pubmai02# dig
> ; <<>> Dig 9.2.3 <<>>
> ;; global options: printcmd
> ;; connection timed out; no servers could be reached
> pubmai02#
>
> when I query for an external domain, let's say google.com, most of the
> time I get the same result as when I execute only dig. Sometimes,
> however, I do get a result; it is quite random. Either way, whether or
> not I get a result, every request made this way produces an entry in
> the query log.
> pubmai02# dig @10.32.5.2 google.com
>
> I can do all those things on the local machine and I can also do it
> from the other name server… same result… and always an entry in the
> query log. However, when I try it from another network segment, for
> example from a machine with IP 10.32.2.19, I do not get a result and I
> do not see an entry in the query log. I can however access the machine
> by telnet/ssh/ftp/ping, etc.
>
> The strange thing is that these problems appeared suddenly as of
> Saturday morning 7.07, and we have not made any changes to the
> network infrastructure or to the bind servers as far as we know. We
> did however notice that emerge is scheduled in crontab by a former
> system engineer, which we had not noticed until now…
> Can anyone give us a clue as to what might be causing this and how
> we can solve the problems?
>
Your troubleshooting steps so far are rather disjointed.
-- Command-line utilities such as ssh/ftp/telnet/ping may or may not use
DNS to resolve their name queries, or they may use other sources of
naming information (e.g. /etc/hosts, NIS, etc.) *before* consulting DNS.
On Gentoo, I believe this is controlled (as it is also on Solaris) by
the nsswitch.conf file. If a client resolves a name and the query never
shows up in your query logs, check whether it got the answer from some
source other than DNS. If it did not, look at the resolvers listed as
"nameserver" entries in the client's /etc/resolv.conf. Is 10.32.5.2
first in the list? If not, then maybe some other resolver gave the
answer before 10.32.5.2 was even queried, and you might be looking at
the query logs on the wrong machine.
-- In the absence of a "@" on the command line, "dig" will, likewise,
use /etc/resolv.conf to find resolvers to use, so, again, see if there
is something unexpected in the contents of that file.
-- Performing a "dig" with "@10.32.5.2" directs a query for that name
directly to 10.32.5.2, so /etc/resolv.conf is irrelevant in that case.
-- Note that "dig" without any parameters generates a query for the root
zone, which is somewhat special. How is named configured, such that it
can resolve root-zone queries in your environment? Hints? Forwarding?
Slave? I'm not sure that querying the root zone is really a recommended
way to troubleshoot resolution in general, since there are some special
characteristics of the root zone which do not apply to other zones.
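
The nsswitch.conf and resolv.conf checks above can be sketched roughly as follows. This is a minimal Python sketch, not a tested tool: the function names (hosts_sources, nameservers, queried_first) are hypothetical, and it assumes the usual "hosts:" line syntax in nsswitch.conf and "nameserver" lines in resolv.conf.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in the order listed."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Drop bracketed action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def nameservers(resolv_text):
    """Return the 'nameserver' addresses in the order the resolver tries them."""
    servers = []
    for line in resolv_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":  # blank line or comment
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def queried_first(resolv_text, expected):
    """True if 'expected' is the first resolver listed."""
    servers = nameservers(resolv_text)
    return bool(servers) and servers[0] == expected
```

On the client you would feed these the contents of /etc/nsswitch.conf and /etc/resolv.conf, e.g. queried_first(open("/etc/resolv.conf").read(), "10.32.5.2"). If "files" or "nis" comes before "dns" in the hosts order, or 10.32.5.2 is not the first nameserver, a missing query-log entry on 10.32.5.2 is not surprising.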

If you _can_ narrow this down to intermittent timeouts for a specific
resolver trying to resolve specific names, then the next thing I'd do is
mimic the action of the iterative resolver from the same box (using the
same source address(es)/port(s) if you've locked those down), working
down from the root zone through each level of the delegation hierarchy
and following referrals. Since you say this is intermittent, I'd do it
several times and see whether any timeouts turn up along the way. If
you're not familiar with
the nuances of iterative resolution, the "+trace" option of dig provides
a reasonable facsimile.
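
Repeating the trace several times and counting failures could look something like the sketch below. It assumes dig is on the PATH and uses its +trace, +tries, +time and -b options; the helper names (trace_command, timed_out, run_traces) and the 60-second overall limit are illustrative choices, not anything prescribed by BIND.

```python
import subprocess


def trace_command(name, source=None):
    """Build a 'dig +trace' command line, optionally bound to a source address."""
    cmd = ["dig", "+trace", "+tries=1", "+time=3"]
    if source:
        cmd += ["-b", source]  # same source address named uses, if locked down
    cmd.append(name)
    return cmd


def timed_out(output):
    """Detect dig's timeout message in its output."""
    return ("connection timed out" in output
            or "no servers could be reached" in output)


def run_traces(name, runs=5, source=None):
    """Run the trace several times and return how many runs failed."""
    failures = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(trace_command(name, source),
                                  capture_output=True, text=True, timeout=60)
            if proc.returncode != 0 or timed_out(proc.stdout + proc.stderr):
                failures += 1
        except subprocess.TimeoutExpired:
            failures += 1
    return failures
```

For example, run_traces("google.com", runs=10, source="10.32.5.2") on the name server itself would show how often a full walk from the root fails. Note that dig +trace first asks a resolver for the root name servers, so a failure at that very first step points at the local resolver rather than at anything further down the delegation chain.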

Not being a Gentoo user, I don't know what "emerge" is, but I assume
it's some sort of auto-update function (?). If it's as much of a
bandwidth hog as most such utilities, it might have been saturating
your Internet link. Since DNS lookups use UDP, they tend to be
disproportionately affected by link saturation.
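
One rough way to see whether UDP queries are being dropped is to send a series of plain DNS queries and count how many go unanswered. The following is only a sketch under stated assumptions: build_query, probe and loss_rate are hypothetical names, the query is a bare A/IN question with recursion desired, and the reply is not validated beyond "something came back".

```python
import socket
import struct
import time


def build_query(name, query_id=1):
    """Build a minimal DNS query packet (type A, class IN, RD flag set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".") if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, name, attempts=10, timeout=2.0):
    """Send one UDP query per attempt; record the round-trip time or None."""
    results = []
    for i in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name, query_id=i + 1), (server, 53))
            sock.recvfrom(4096)  # any reply counts; the ID is not checked
            results.append(time.monotonic() - start)
        except OSError:  # includes socket timeouts
            results.append(None)
        finally:
            sock.close()
    return results


def loss_rate(results):
    """Fraction of attempts that got no reply."""
    if not results:
        return 0.0
    return sum(1 for r in results if r is None) / len(results)
```

Something like loss_rate(probe("10.32.5.2", "google.com")) run while the update job is active and again while it is idle would give a crude comparison. A much higher loss rate during the job would be consistent with the link-saturation theory, though it would not prove it.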
- Kevin