Timeouts during cache cleaning and zone collection

Nod none at nospam.none
Tue Jun 21 16:29:47 UTC 2005


On Mon, 20 Jun 2005 11:05:06 +0200, "Auer, Karl James" <karl.auer at id.ethz.ch>
wrote:

>Hi there.
>
>We are seeing a problem with BIND 9.3.0, compiled with threading on
>Solaris, whereby the servers stop answering queries for a couple of
>seconds. Queries in this interval time out. That is, they are not
>answered slowly, they are not answered at all.
>
>The servers do this a) when they clean their caches and b) when they are
>downloading zones.
>
>Archived messages on the matter of cache cleaning suggest that these
>timeouts are normal for BIND, and that the only way to avoid them is to
>turn cache cleaning off. I've tried setting the cleaning interval to
>only a few minutes, but it just caused more timeouts - there seems to be
>a sort of minimum interruption due to cache cleaning.
>
<snip>

I'm seeing a similar problem, specifically with cache cleaning.
Whenever the cache cleaner runs, it appears to block requests to the
nameserver. The secondary cache does take over, but most clients (Windows) see
a 5-6 second delay before they fall back to the secondary server. With the
cleaner set to run once a minute, each pass takes about 7-8 seconds. Left at an
hour, each pass takes about 20 minutes.
Turning the cleaner off results in BIND using up all the RAM and being killed
by the OS.
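
For reference, the settings discussed above go in the options block of
named.conf. The fragment below is only a sketch: the values are illustrative
assumptions, not the configuration of the server described here.
max-cache-size is included as one documented way to bound cache memory;
whether it reduces the blocking described above is untested.

```
options {
    // Minutes between cache-cleaning passes (default 60).
    // A value of 0 disables periodic cleaning entirely.
    cleaning-interval 60;

    // Upper bound on memory used by the cache (illustrative value).
    // When the cache reaches this limit, records are expired early.
    max-cache-size 512M;
};
```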

The server in question serves about 11,000 zones. Hardware: dual Xeon 2.8 GHz,
4 GB RAM, FreeBSD 5.4 64-bit, with MAXDSIZ, MAXSSIZ, and DFLDSIZ currently set
to 2 GB. There are generally about 800-900 recursive clients at any given time.

If anyone has ideas on how to improve the cache cleaner's performance, they
would be most appreciated. As it is, the nameserver 'going away' happens far
too often and is very noticeable.


