BIND 9.x caching performance under heavy loads

roy.mongiovi at bellsouth.com
Mon Mar 7 15:10:10 UTC 2005


We've got a number of caching-only nameservers, some running Red Hat AS
2.1 and some AS 3.0.  The AS 2.1 servers run BIND 9.2.1 (the standard
Red Hat package), although we've also tried 9.2.3 and 9.2.2-P3.  The AS
3.0 servers run BIND 9.2.4, also the standard Red Hat package.  We're
in the process of upgrading everything to AS 3.0 in an attempt to have
a better platform for dealing with this problem.

We also see the slowdown after 24 hours of operation.  We're currently
keeping things under control with a nightly restart of BIND, but
obviously that's not an ideal situation.

We don't restrict cache size.  It's a pretty vanilla caching-only
nameserver as far as BIND goes.  We've got four dual-processor IBM
x335s with 2.8 GHz processors and 8 GB of RAM.  A load balancer
distributes incoming requests across those four servers.  Each of them
forwards requests to two back-end servers that reach the Internet
directly, bypassing the load balancer.  It's built that way because of
the load balancer, but that's another story.
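For reference, the forwarding setup described above would look
something like this in named.conf on the front-end servers.  The
addresses are placeholders (the real back-end IPs aren't given here),
and "forward only" is an assumption about the setup, since the front
ends can't reach the Internet around the load balancer themselves:

```
// Front-end caching server: hand every recursive query to the two
// back-end servers that reach the Internet directly.
// 10.0.0.1 and 10.0.0.2 are hypothetical addresses.
options {
	forward only;           // never iterate from the front end itself
	forwarders {
		10.0.0.1;
		10.0.0.2;
	};
};
```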

After about 24 hours of operation, CPU utilization starts to climb
gradually on the four front-end servers.  At the same time, the
back-end servers slow down in processing incoming queries.  Their CPU
utilization doesn't climb, but BIND's UDP receive queue fills up; for
some reason, it just doesn't process requests as fast as it did before.

External load is about the same each day, and restarting BIND clears up
the problem, so I think it has to be some sort of bug in BIND.  The
really interesting part is that these servers went into operation in
April 2004, and we didn't see any problems until the end of October.
Since October, however, we've had this problem, and it seems to be
escalating.

I'm trying the "cleaning-interval 0;" fix now; we'll know tomorrow if
that helps.  I'm also going to build a non-threaded, non-IPv6 version
to see whether it can handle our load and whether it fixes the problem.
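The workaround being tested disables BIND's periodic cache-cleaning
walk.  Since we don't currently cap the cache, max-cache-size is shown
below only as the knob we'd reach for if we decided to bound memory
instead; the 512M value is an example, not our configuration:

```
options {
	// Disable the periodic walk that expires stale cache entries;
	// on a very large cache this walk can stall query processing.
	cleaning-interval 0;

	// Not set in our config -- this is the option that would cap
	// cache memory if we chose to restrict cache size (example value).
	max-cache-size 512M;
};
```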



More information about the bind-users mailing list