DNS dying

Jim Reid jim at rfc1035.com
Tue Oct 3 22:02:10 UTC 2000


>>>>> "Guillermo" == Guillermo Villasana Cardoza <terius at villasana.com.mx> writes:

    Guillermo> I know the -- MARK -- is from syslog... but after the
    Guillermo> last query made... the dns died. The times it has died
    Guillermo> the last query is a points to a CNAME or a Lame server
    Guillermo> error...

The logs you provided do not really give any proof of what you said is
happening. They tell us that your server found a mangled MX record at
07:03:31 on Oct 3rd, but they do not show when or even if the name
server died. And there's no indication in the logs you showed that
your name server got a query or reply that caused it to fall over, or
in fact that it hit any problem that caused a catastrophic failure.

The name server is unlikely to die because of a lame delegation or a
CNAME target of an MX record. These are *very* common errors and if
they caused name servers to die, the Internet would be a very
different place because huge numbers of name servers would be
continually falling over. OTOH if these configuration errors did make
name servers die, maybe they wouldn't occur so often?

    Guillermo> How can I see what is making it really die?

The name server should print a message in the system logs if it
encounters a fatal error, such as running out of memory or being unable
to set up sockets on port 53. Check all your system log files. What
version of BIND are you running and what OS is it running on? Maybe
you're running a version that's got a security hole and someone is
exploiting that hole? If you think that your server is being attacked,
you could turn on query logging and see for sure what the last query
was before the server died. As a last resort, you could also run the
name server with debugging turned up high and wade through the
megabytes of trace/debug messages.
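
If wading through the log files by hand is too tedious, something like
the rough Python sketch below could pull out the last few lines that
named logged before it went quiet. It is only an illustration under
assumptions: it expects traditional syslog lines of the form
"Oct  3 07:03:31 host named[1234]: message", the helper names
last_named_lines and report are made up for this example, and the log
file location differs between operating systems, so you would pass in
whatever path your syslog.conf actually uses.

```python
import re
from collections import deque

# Assumption: traditional syslog layout, e.g.
#   "Oct  3 07:03:31 host named[1234]: message text"
# Other syslog formats would need a different pattern.
NAMED_LINE = re.compile(r"^\S+\s+\d+\s+[\d:]+\s+\S+\s+named(\[\d+\])?:\s")


def last_named_lines(lines, count=20):
    """Return the last `count` lines logged by named, oldest first."""
    tail = deque(maxlen=count)  # keeps only the most recent matches
    for line in lines:
        if NAMED_LINE.match(line):
            tail.append(line.rstrip("\n"))
    return list(tail)


def report(path, count=20):
    """Read one log file (hypothetical helper) and return named's last lines."""
    with open(path, errors="replace") as log:
        return last_named_lines(log, count)
```

Calling report() with the path of each system log file in turn and
printing the result shows what named last said in each one. Lines from
other daemons and the -- MARK -- entries are skipped, so a fatal error
message, if there is one, should be near the end of the output.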

You also said that the name server was working fine until a few days
ago. What has changed since the server ran normally? Could you have
applied a patch that changed or zapped a shared system library? The
most recent change(s) to your system will be the most likely
explanation for the problem.


