named taking up 100% resource on only 1 CPU on a multi CPU system

Kuhtz, Christian christian.kuhtz at BellSouth.com
Sat Jan 11 05:18:34 UTC 2003



While I agree that BIND9 isn't as quick as BIND8, running BIND8 on a
multiproc box is IMHO ugly enough operationally that some of us would rather
bite the bullet and move to BIND9 now, and hope for performance improvements
as we move forward.

I consider it more of an issue of perspective.  If you have a long term focus,
and for a variety of reasons you may want to be on BIND9 on a multiproc box, it
can very well be worth the trade-off in performance to make a strategic
decision and end up with an overall cleaner setup.  (btw, there's also some
worthwhile benchmarking one might want to do to see just how far you want to
push the multiproc setup even with BIND9 and when the point of diminishing
returns is reached).
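
As a rough illustration of that benchmarking idea, here is a minimal Python
sketch that takes throughput figures you measured yourself (for example one
load-test run per CPU or worker-thread count) and reports where adding more
stops paying off.  The function name, the 10% threshold and the input layout
are my own assumptions for illustration, not anything BIND provides.

```python
def diminishing_point(results, min_gain=0.10):
    """Return the CPU count beyond which extra CPUs gain less than min_gain.

    results:  iterable of (cpus, queries_per_second) pairs that you measured.
    min_gain: smallest relative improvement still considered worthwhile
              (0.10 means 10%); this threshold is an arbitrary assumption.
    """
    ordered = sorted(results)
    if not ordered:
        raise ValueError("no measurements given")
    for (prev_cpus, prev_qps), (_cpus, qps) in zip(ordered, ordered[1:]):
        if prev_qps <= 0:
            raise ValueError("throughput must be positive")
        gain = (qps - prev_qps) / prev_qps  # relative gain of this step
        if gain < min_gain:
            return prev_cpus  # the previous step was the last worthwhile one
    return ordered[-1][0]  # every step was still worthwhile
```

Feed it only numbers from your own runs; what counts as "worthwhile" depends
on what the extra hardware costs you.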

YMMV, of course.

Doing a trace on an already busy system can be a really bad idea as well.
If at all possible, one should try to reproduce the load in a pre-production
environment.  (queryperf in the contrib section of the BIND distribution
can help get your box busy, or you may resort to other tools).  Or perhaps
throw a sniffer into the mix to see what's inbound to the server (perhaps
you're suffering secondary effects from garbled network traffic).
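
To get a test box busy with queryperf you need an input file of queries.  A
minimal Python sketch for generating one follows; it assumes the usual
queryperf input layout of one "name type" pair per line, and the function
name and the host names you pass in are hypothetical placeholders.  Check the
README shipped with your copy of queryperf before relying on the format.

```python
import random


def make_queryperf_input(names, qtypes=("A",), count=1000, seed=0):
    """Build query lines of the form '<name> <type>', one per line.

    names:  host names to query (supply names your test server can answer).
    qtypes: record types to mix in, e.g. ("A", "MX").
    count:  number of query lines to produce.
    seed:   fixed seed so that repeated runs produce the same load.
    """
    rng = random.Random(seed)
    lines = []
    for _ in range(count):
        lines.append(f"{rng.choice(names)} {rng.choice(qtypes)}")
    return "\n".join(lines) + "\n"
```

Write the returned string to a file and point queryperf at it and at the
pre-production server, along the lines of "queryperf -d queries.txt -s
<test server address>"; never aim this at a production box.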

Looking at named.stats may be a good idea, too, to get a handle on how named
sees the world and get a glimpse as to what it is up to.
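
One way to read named.stats is to take two dumps some minutes apart and look
at what changed in between.  The Python sketch below assumes a simple
"counter value" layout per line, as some BIND 9 releases write it; other
versions lay the file out differently, so treat the parsing as an assumption
to adapt.  Both function names are hypothetical.

```python
def parse_counters(text):
    """Collect 'name value' lines from a statistics dump into a dict.

    Lines that do not consist of exactly a name and a non-negative integer
    (such as the dump header and footer markers) are skipped.  If the text
    holds several dumps, later values overwrite earlier ones.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def counter_delta(before, after):
    """Return how much each counter in 'after' grew relative to 'before'."""
    return {name: value - before.get(name, 0) for name, value in after.items()}
```

A counter that jumps far more than the others between two dumps is a
reasonable place to start looking.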

IMHO, it's relatively easy to overcook a single CPU in, say, a service
provider environment; depending on your specific architecture, of course.
So, there may not be anything "wrong" per se with the fact that he's frying
a single CPU.  Hard to tell with the information given so far.

Thanks,
Christian

-----Original Message-----
From: Rick Jones [mailto:foo at bar.baz.invalid]
Sent: Friday, January 10, 2003 5:07 PM
To: comp-protocols-dns-bind at isc.org
Subject: Re: named taking up 100% resource on only 1 cpu on a multi cpu
system


Simon Waters <Simon at wretched.demon.co.uk> wrote:
> Elias wrote:
>> Anybody got a clever solution for my problem?
> Run BIND 9.

It might be a little more complicated than that. Especially given that
BIND 9 may not be as fast as BIND 8 to begin with.
(ftp://ftp.cup.hp.com/dist/networking/briefs/)

> BIND 8 is single threaded so to scale you'd have to run multiple
> copies, BIND 9 just generates threads based on how many CPU's
> you have.

> Do you understand why it is suddenly busy at this point, as
> throwing more CPU's at the problem may just result in using all
> your CPU's rather than just the one, although I'd recommend BIND
> 9 anyway.

If the system has stopped responding to queries and the named is
consuming 100% of a CPU, then it does indeed seem like a good idea to
see what the named is doing.  Start with a system call trace, then a
library call trace, then perhaps a profile if there is something like
prospect for Solaris.

rick jones
-- 
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to raj in cup.hp.com  but NOT BOTH...



