URGENT, PLEASE READ: 9.5.0-P1 now available

Emery Rudolph emery.rudolph at gmail.com
Tue Jul 29 01:27:16 UTC 2008


"It's strange that a server handling such a moderate level of queries
consumes all available file descriptors.  One possible reason that can
cause that is that the server has a reachability problem (whether it's
in the server itself, in the link to the Internet, or in the
authoritative servers that it often sends queries to), so that many queries
result in timeouts.  It might be helpful if you can check this point
(by capturing packets, etc)."
------------------------------------------------------------------------------------------------------


This is not the case.

My two nameservers have been operating for more than two years and are
thoroughly monitored. I can tell you with full confidence, and with the
monitoring data to prove it, that neither server's CPU has exceeded 5%
utilization in all of that time. Since I upgraded the secondary server, its
CPU has hovered between 60% and 80%. The problem is strictly the BIND code.

Was the code changed so that named opens a pool of random UDP ports for use
in answering these queries?
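
As an illustration of what this question is asking (this is not BIND's
actual code), here is a minimal Python sketch in which every outstanding
query gets its own UDP socket bound to a port chosen by the OS. The helper
name `open_query_sockets` is hypothetical. The only point is that each such
socket holds one file descriptor until it is closed.

```python
import socket


def open_query_sockets(count):
    """Open `count` UDP sockets, each bound to a port picked by the OS.

    Hypothetical helper for illustration; this is not how named is written.
    """
    socks = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("127.0.0.1", 0))  # port 0 asks the OS for an unused port
        socks.append(s)
    return socks


if __name__ == "__main__":
    socks = open_query_sockets(5)
    for s in socks:
        # One descriptor and one distinct local port per pending query.
        print("fd", s.fileno(), "port", s.getsockname()[1])
    for s in socks:
        s.close()
```

If a server worked this way, the number of descriptors in use would track
the number of queries still waiting for an answer.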

Emery Rudolph.

On Mon, Jul 28, 2008 at 7:18 PM, JINMEI Tatuya <Jinmei_Tatuya at isc.org>
wrote:

> At Thu, 24 Jul 2008 22:23:26 -0400,
> Emery <atlantic at comcast.net> wrote:
>
> > Thank you for your response. I will await further details on newer
> > versions that more fully address this issue.
> >
> > By failure I am referring to the random crashes others have reported
> > with other errors such as "bad file handle" errors as opposed to the
> > "too many open" error.
>
> Random crashes (process termination, often with core-dumping) and the
> 'too many open' error are different types of problems.  There may be
> exceptions, but my general understanding is:
>
> - the former mainly happens on beta versions with threads
> - the latter mainly happens on P1s
>
> And, as you probably guessed, 'too many open' errors are not
> necessarily fatal (though noisy and annoying) like a crash.  But it
> will also cause more failure results (such as SERVFAIL error returned
> to clients), so I'm afraid it's actually unacceptable if this error
> regularly happens.
>
> > I guess I'm looking for feedback on what you have seen in reference to
> > crashes and the too many open errors. Is there any pattern to the type
> > of system (resource availability) and crashes? The nameservers I am
> > running handle about 1100 queries a minute, but only offer recursion to
> > internal clients.
>
> It's strange that a server handling such a moderate level of queries
> consumes all available file descriptors.  One possible reason that can
> cause that is that the server has a reachability problem (whether it's
> in the server itself, in the link to the Internet, or in the
> authoritative servers that it often sends queries to), so that many queries
> result in timeouts.  It might be helpful if you can check this point
> (by capturing packets, etc).
>
> ---
> JINMEI, Tatuya
> Internet Systems Consortium, Inc.
>
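
The reasoning quoted above can be put into rough numbers. The sketch below
assumes one descriptor per outstanding upstream query and uses the
steady-state relation: outstanding queries = arrival rate x time each query
stays open. The 30-second hold time and the timeout fractions are made-up
values for illustration; only the figure of 1100 queries per minute comes
from this thread. Queries answered from cache would open no upstream socket
at all, so treat the result as an upper bound under these assumptions.

```python
def outstanding_descriptors(queries_per_minute, timeout_fraction, hold_seconds):
    """Estimate descriptors held by upstream queries that end in a timeout.

    queries_per_minute: upstream queries sent per minute
    timeout_fraction:   share of those queries that never get an answer (0..1)
    hold_seconds:       assumed time a timed-out query keeps its socket open
    """
    rate_per_second = queries_per_minute / 60.0
    return rate_per_second * timeout_fraction * hold_seconds


if __name__ == "__main__":
    # Hypothetical timeout fractions; 30 s hold time is an assumption.
    for fraction in (0.01, 0.5, 1.0):
        held = outstanding_descriptors(1100, fraction, 30)
        print(f"timeout fraction {fraction:.2f}: about {held:.0f} descriptors held")
```

With few timeouts the estimate stays small, which is why a moderate query
rate exhausting the descriptor limit points toward many queries timing out.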




More information about the bind-users mailing list