Timeouts during cache cleaning and zone collection

Wed Jun 22 03:28:13 UTC 2005

On Wed, 22 Jun 2005 06:50:55 +0900, JINMEI Tatuya / 神明達哉
<jinmei at isl.rdc.toshiba.co.jp> wrote:

>>>>>> On Tue, 21 Jun 2005 16:44:22 GMT, 
>>>>>> none at nospam.none (Nod) said:
>
>> I'm having a similar issue to this, namely the cache cleaning.
>> It seems that whenever the cache-cleaner runs, it blocks requests to the
>> nameserver. While the secondary cache will take over, most of the clients will
>> see a 5-6 second delay (Windows) before using the secondary server. Setting the
>> cache cleaner to run once a minute takes about 7-8 seconds to clean. Leaving it
>> at an hour means it will be cleaning for about 20 minutes.
>> Turning off the cleaner results in BIND using up all the RAM and getting killed
>> by the OS.
>
>> The server in question serves about 11000 zones. Dual Xeon 2.8, 4 GB RAM, FreeBSD
>> 5.4 64-bit, MAXDSIZ, MAXSSIZ, and DFLDSIZ set to 2 GB at the moment. Generally,
>> about 800-900 recursive clients at any given time.
>
>> If anyone has any ideas how to improve the cache cleaner performance, it would
>> be most appreciated. As it is now, the nameserver 'going away' happens far too
>> often, and is very noticeable.
>
>As I said in a separate message, upgrading to latest versions would be
>a good idea (if you're using an older version), particularly for this
>type of issue.  As also mentioned in the separate message, specifying
>a build-time option for memory management sometimes improves the
>performance.
>
>And finally, if you're enabling threads, I'd suggest turning them off.
>The current implementation of BIND9 threading is particularly
>unfriendly with FreeBSD's thread support.  It doesn't buy anything but
>poor performance.  (This will be improved in 9.4, but it cannot be a
>short-term solution).
>
>					JINMEI, Tatuya
>					Communication Platform Lab.
>					Corporate R&D Center, Toshiba Corp.
>					jinmei at isl.rdc.toshiba.co.jp
>

Just upgraded from 9.3.1rc1 to 9.3.1. Unfortunately, cache cleaning still
results in no answer from the nameserver until it's done.
At the moment, my workaround is to set max-cache-size to 16 MB with a 1-minute
cleaning-interval, and let it chew on the CPU.
Operating system choices aside, is there any reason to expect that moving to a
threaded nameserver would help overcome this issue?
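
For reference, the workaround described above would look roughly like the
following in named.conf. This is only a sketch: it assumes the settings live in
the global options block (they can also be set per view), and it uses the
values mentioned in this message rather than recommended ones.

```
options {
        // Cap the cache at 16 MB so each cleaning pass has less to walk.
        max-cache-size 16M;

        // Run the cache cleaner every minute (the value is in minutes).
        cleaning-interval 1;
};
```

A small cache with frequent cleaning trades hit rate and CPU time for shorter
pauses; it does not remove the blocking itself.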