AW: AW: Re[2]: Host domain.dom not found: 2(SERVFAIL)

Walkenhorst, Benjamin Benjamin.Walkenhorst at telekom.de
Thu Nov 4 15:13:17 UTC 2004


Hello,

> Now it's working ok. I've restarted bind.
> But i'm sure that after some period of time
> this problem will appear again.
> And i'm also sure that when i restart just only bind -
> problem disappears.

Okay, I misunderstood you there. Sorry.

> Maybe when this problem appears again I will let you know.
> Or maybe you can give me some advice, for example to turn on
> another logging level (now I have "severity info") or something else.
> I think it's a problem of the named core or a problem of the OS (FreeBSD
> 5.2.1). But IMHO it's a bind problem. I have plenty of memory and
> disk space. All other daemons work without problems.

While I did not have any bad experiences with FreeBSD 5.2.1, AFAIK it's not
recommended for production systems. 5.3 probably will be, so you might try giving
your machine an update.
However, I don't think that explains the problems you are facing.
Does your nameserver stop responding altogether? Is it still reachable via rndc?
What does 'rndc status' say? Do recursive queries still work?
Do you have to give the server a 'hard restart' (/etc/rc.d/named restart) or does
'rndc reload' or 'rndc reconfig' work?
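
The first two checks could be wrapped up roughly like this. It is only a sketch: the function name named_health is made up, and 127.0.0.1 and example.com are placeholders for your server address and for some name your server is not authoritative for.

```shell
# named_health [SERVER] [NAME]: basic "is it still alive?" checks.
# SERVER and NAME default to placeholder values; adjust to your setup.
named_health() {
    server=${1:-127.0.0.1}
    name=${2:-example.com}

    # Is the control channel still answering?
    if rndc status; then
        echo "rndc: reachable"
    else
        echo "rndc: NOT reachable"
    fi

    # Does a recursive lookup through this server still succeed?
    # dig prints "status: NOERROR" in the header of a successful answer.
    if dig "@$server" "$name" A +time=3 +tries=1 | grep -q 'status: NOERROR'; then
        echo "recursion: ok"
    else
        echo "recursion: FAILED"
    fi
}
```

Run named_health once while everything works and again when the problem shows up, then compare the two outputs.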

The first thing I'd do is crank up the debugging level. Something between 5 and 10 will give
you plenty of output. And if it doesn't, you can go as far as 99. =)
Also, you might want to explicitly log queries if you are not already doing so.
If you want to watch even more closely, you can start BIND manually in the foreground with the
'-g' option, which sends all of its log output to STDERR. (Note that this disables logging to files
*completely*, so you might want to pipe the output through tee or a pager of your choice.)
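
A rough sketch of what the logging part of named.conf could look like for this. The channel names (debug_log, query_log) and the file names are made up for illustration, and the sizes are arbitrary. Note that a channel with a fixed debug severity only receives debug messages while the server is in debugging mode, so you still need to start named with '-d 10' or run 'rndc trace 10'.

```
logging {
    // Everything up to debug level 10, with enough context to
    // search the file by time and category afterwards.
    channel debug_log {
        file "debug.log" versions 3 size 20m;
        severity debug 10;
        print-time yes;
        print-category yes;
        print-severity yes;
    };

    // Queries go to their own file so they don't drown in debug output.
    channel query_log {
        file "queries.log" versions 3 size 20m;
        severity info;
        print-time yes;
    };

    category default { debug_log; };
    category queries { query_log; };
};
```

Relative file names are resolved against the 'directory' option, and the user named runs as must be able to write there.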

Another thing I found useful when debugging network problems is a packet sniffer.
It will enable you to see what's actually going over the wire.
Also, you might want to monitor your server closely (for system load, memory usage, network bandwidth
usage and so on) to see if anything unusual is happening.

You might want to look at interface-interval. I can't imagine how this might cause your
problem, but weirder things have happened:
Every ${interface_interval} minutes, BIND checks whether the system's network interfaces
have changed. It will then stop listening on interfaces that have gone away and try to bind to new ones.
If BIND is running without root privileges (as recommended), it won't be able to bind
to a new interface without restarting. Like I said, this shouldn't cause the kind of
trouble you are facing, but who knows?
Also, if BIND is running in a chroot, make sure that the 'directory' entry in your
config refers to a path relative to BIND's chroot directory.
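
For reference, the two settings mentioned above live in the options block. The path below is only an example and assumes named was started with something like '-t /var/named'; your layout may differ.

```
options {
    // Interpreted inside the chroot, so with -t /var/named this is
    // really /var/named/etc/namedb on disk.
    directory "/etc/namedb";

    // Minutes between interface scans; 60 is the default,
    // 0 turns the periodic scan off.
    interface-interval 60;
};
```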

I suggest: Raise the debugging level to 10 and see if anything useful shows up. By default,
if BIND is running with a debugging level > 0, debugging info will be written to named.run.
NOTE: a debugging level of 10 will give you a *HUGE* output file; however, you can find
interesting entries by category and/or time. Also, you should log queries to an individual
file for testing purposes. 
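
To pick entries out of a big log by category and time, something along these lines might do. It is a sketch that assumes the log lines start with date, time and category (as they do when print-time and print-category are enabled for the channel), for example "04-Nov-2004 15:12:00.000 queries: info: ...". The function name filter_named_log is made up.

```shell
# filter_named_log CATEGORY FROM TO FILE
# Print lines of FILE in CATEGORY whose time of day lies between
# FROM and TO (both HH:MM, inclusive). The date field is ignored,
# so use it on a file that covers a single day.
filter_named_log() {
    awk -v cat="$1:" -v from="$2" -v to="$3" '
        # $2 is the time field (hh:mm:ss.mmm), $3 is "category:"
        $3 == cat && substr($2, 1, 5) >= from && substr($2, 1, 5) <= to
    ' "$4"
}
```

For example, 'filter_named_log queries 15:10 15:15 debug.log' would show only the query entries from that five-minute window.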
If that doesn't help, a packet sniffer will possibly help you a lot; I consider a packet
sniffer one of the most valuable tools available for debugging network problems (especially
of the really weird kind). tcpdump should come with FreeBSD. If not, it's available from
ports.

Finally, if nothing helps: FreeBSD 5.3 is due to be released this weekend; it's going to be the
first 5.x-release that will be recommended for production use. Are you using a custom kernel
or a GENERIC one? You might also want to scan through /var/log/messages; maybe something will 
show up that looks entirely unrelated at first but is causing your problem.


I hope some of the above will be helpful to you.
Kind regards and good luck,
Benjamin


