URGENT, PLEASE READ: 9.5.0-P1 now available

JINMEI Tatuya / 神明達哉 Jinmei_Tatuya at isc.org
Mon Jul 28 23:18:55 UTC 2008


At Thu, 24 Jul 2008 22:23:26 -0400,
Emery <atlantic at comcast.net> wrote:

> Thank you for your response. I will await further details on newer 
> versions that more fully address this issue.
> 
> By failure I am referring to the random crashes others have reported 
> with other errors such as "bad file handle" errors as opposed to the 
> "too many open" error.

Random crashes (process termination, often with a core dump) and the
'too many open' error are different types of problems.  There may be
exceptions, but my general understanding is:

- the former mainly happens on beta versions with threads
- the latter mainly happens on P1s

And, as you probably guessed, 'too many open' errors are not
necessarily fatal the way a crash is (though they are noisy and
annoying).  But they also cause more failures (such as SERVFAIL
responses returned to clients), so I'm afraid it's actually
unacceptable if this error happens regularly.

> I guess I'm looking for feedback on what you have seen in reference to a 
> crashes and the too many open errors. Is there any pattern to the type 
> of system (resource availability) and crashes. The nameservers I am 
> running handle about 1100 queries a minute, but only offer recursion to 
> internal clients.

It's strange that a server handling such a moderate level of queries
consumes all available file descriptors.  One possible cause is a
reachability problem (whether it's in the server itself, in the link
to the Internet, or in the authoritative servers that it often sends
queries to), so that many queries result in timeouts.  It might be
helpful if you can check this point (by capturing packets, etc).
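
As a rough illustration of why timeouts matter here, the sketch below
estimates how many sockets are open at once, assuming each outstanding
upstream query holds one descriptor until it is answered or times out.
Only the 1100 queries per minute figure comes from the report above;
the cache-miss ratio, the number of upstream queries per miss, and the
hold times are hypothetical values chosen for illustration, not
measurements.

```python
def estimated_open_sockets(queries_per_minute, miss_ratio,
                           upstream_per_query, hold_seconds):
    """Little's law: concurrent sockets = arrival rate * hold time.

    miss_ratio, upstream_per_query and hold_seconds are assumed
    inputs for illustration, not values taken from any real server.
    """
    upstream_per_second = (queries_per_minute / 60.0
                           * miss_ratio * upstream_per_query)
    return upstream_per_second * hold_seconds


# 1100 queries/minute is from the report; the rest is assumed.
# Healthy path: upstream answers arrive in about 0.1 seconds.
healthy = estimated_open_sockets(1100, 0.5, 2, 0.1)
# Degraded path: every upstream query waits about 10 seconds
# before it is abandoned.
degraded = estimated_open_sockets(1100, 0.5, 2, 10.0)

print(f"healthy:  about {healthy:.1f} sockets open at once")
print(f"degraded: about {degraded:.1f} sockets open at once")
```

Under these assumptions the number of descriptors in use scales
linearly with how long each query stays outstanding, so widespread
timeouts multiply descriptor usage even when the query rate itself is
unchanged.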

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.

