Bind9 Crazy-high CPU on Linux
Mark Andrews
Mark_Andrews at isc.org
Fri Jan 19 01:58:28 UTC 2007
> Thank you for all the references and help.
>
> I upgraded to 9.4rc1 with the following results:
>
> Massive jump in memory usage (about double). The named process now shows a
> memory footprint of close to 900MB where before 500MB would kill it.
>
> CPU stays between 20-25% and spikes to 30-35% during a cleaning interval
> which lasts only a minute or two.
>
> Previously with 9.3.2, rndc status showed upwards of 2,000 or more recursive
> clients. Now it shows only less than 500 at any given time and the output
> format has changed:
>
> recursive clients: 463/3900/4000
>
> The server has been up a little over 36 hours.
>
> I also noted three new items in the named.stats file. "Duplicate" and
> "dropped" are new values. Does anyone know how to fit them in to the
> greater scheme? For example recursion can be subtracted from a combined
> total of success, referral, nxrrset, nxdomain and failure to generate a
> percentage. Where do the new values fit?
>
> success 30428464
> referral 2099872
> nxrrset 6270659
> nxdomain 16121686
> recursion 29924813
> failure 8892309
> duplicate 1101621
Duplicate queries are ones where an identical query was
received (source address and port, qname, qtype, qclass)
while an existing query was being resolved.
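As a rough illustration of that check (a Python sketch only, not named's
actual code; the QueryKey and InFlight names are made up for the example,
and it assumes an exact match on all five fields):

```python
from typing import NamedTuple, Set


class QueryKey(NamedTuple):
    """The fields that must all match for a query to count as a duplicate."""
    src_addr: str
    src_port: int
    qname: str
    qtype: str
    qclass: str


class InFlight:
    """Hypothetical tracker for queries that are still being resolved."""

    def __init__(self) -> None:
        self.pending: Set[QueryKey] = set()
        self.duplicate = 0  # plays the role of the "duplicate" counter

    def receive(self, key: QueryKey) -> bool:
        """Return True if the query should be resolved, False if it is a duplicate."""
        if key in self.pending:
            self.duplicate += 1
            return False
        self.pending.add(key)
        return True

    def finish(self, key: QueryKey) -> None:
        """Resolution completed (or failed); later identical queries are new again."""
        self.pending.discard(key)
```

A client that retransmits the same question from the same socket before the
first answer comes back would be counted here; the same question from a
different source port would not.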
> dropped 159814
Excessive recursive queries for <qname, qtype, qclass> other
than duplicate queries. Excessive is self-adjusting within
10:100 queries. Successful resolution of a query for which
there were drops raises the number of simultaneous recursive
clients for a given <qname, qtype, qclass> tuple (provided
another successful query has not already raised the threshold).
This is then decayed with a timer.
From the above figures, it looks like your client base is
making lots of queries which fail to resolve.
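To put rough numbers on that, here is a small Python sketch using the
counters quoted above. It assumes, as in the original question, that
success, referral, nxrrset, nxdomain and failure together make up the
total of answered queries; whether duplicate and dropped belong in that
total is left open here.

```python
# Counters copied from the named.stats excerpt quoted above.
stats = {
    "success": 30428464,
    "referral": 2099872,
    "nxrrset": 6270659,
    "nxdomain": 16121686,
    "recursion": 29924813,
    "failure": 8892309,
    "duplicate": 1101621,
    "dropped": 159814,
}

# Assumption: these five outcome counters sum to the answered total.
outcomes = ("success", "referral", "nxrrset", "nxdomain", "failure")
answered = sum(stats[name] for name in outcomes)

# Share of each outcome in the answered total.
shares = {name: stats[name] / answered for name in outcomes}

for name in outcomes:
    print(f"{name:9s} {stats[name]:>10d} {shares[name]:6.1%}")
```

On these figures failure comes to roughly 14% of the answered total and
nxdomain to roughly 25%.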
> -Matt
>
>
> > -----Original Message-----
> > From: bind-users-bounce at isc.org
> > [mailto:bind-users-bounce at isc.org] On Behalf Of Stefan Puiu
> > Sent: Tuesday, January 16, 2007 8:13 AM
> > To: Schlosser, Matt D.
> > Cc: bind-users at isc.org
> > Subject: Re: Bind9 Crazy-high CPU on Linux
> >
> > Hi,
> >
> > On 1/15/07, Schlosser, Matt D. <mschlosser at eschelon.com> wrote:
> > > The machines run between 800 and 1,000 queries/second for both
> > > authoritative and recursive zones. After 12-24 hours, the CPU will
> > > spike to 100% and sit there while the machine times out any more
> > > queries. The only resolution is to restart bind.
> >
> > I haven't personally experienced this, but I've seen it reported quite
> > a few times on this list. IIRC, it's been reported that the cache
> > cleaning can be quite heavy sometimes, so you might want to adjust the
> > cleaning interval.
> >
> > Also, recompiling BIND with internal malloc support was reported to
> > help (this requires editing a header file IIRC). That part seems to be
> > detailed here:
> >
> > http://groups.google.com/group/comp.protocols.dns.bind/browse_
> > thread/thread/c830e65e2247c630/bfe3178894e98351?lnk=gst&q=jinm
> > ei+internal+malloc&rnum=1#bfe3178894e98351
> >
> > No idea why running on Windows would make a difference.
> >
> > Look in the archives, I believe it's been quite well covered.
> >
> > Stefan.
> >
> >
>
>
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: Mark_Andrews at isc.org