Timeouts during cache cleaning and zone collection

Wed Jun 22 23:40:57 UTC 2005

On Thu, 23 Jun 2005 03:55:41 +0900, JINMEI Tatuya / 神明達哉
<jinmei at isl.rdc.toshiba.co.jp> wrote:

>>>>>> On Wed, 22 Jun 2005 03:28:13 GMT, 
>>>>>> none at nospam.none (Nod) said:
>
>> Just upgraded from 9.3.1rc1 to 9.3.1. Unfortunately, the cache cleaning still
>> results in no answer from the nameserver until it's done.
>
>Hmm...just to be sure, did you also enable
>ISC_MEM_USE_INTERNAL_MALLOC? (I'm dubious about whether this takes a
>dominant role for this particular issue, though).
No, we didn't use this particular option.

>
>Also, didn't you get *any* responses during the cache cleaning?  Or
>did you still see some responses (as well as no-answers)?  In my
>understanding, named should still be able to respond to queries even
>during the cache cleaning while it can drop some of the queries due to
>the additional cleaning task.
The server would respond to an 'rndc status', and showed about 30-40 recursive
clients. For a server normally serving 600+, I'd consider this to be
non-responsive. If it was answering DNS requests, I couldn't tell.

>
>If you can see some responses, one additional possibility of tuning is
>to reduce the load of each 'chunk' of the whole cleaning work.  You
>can do this by modifying the DNS_CACHE_CLEANERINCREMENT macro at line
>48 of bind-9.3.1/lib/dns/cache.c:
>
>#define DNS_CACHE_CLEANERINCREMENT	1000	/* Number of nodes. */
>
>I have heard a report that changing this value to 200 could eliminate
>the packet loss in some environments.  (Unfortunately, there is no other
>way than modifying the source code to tune this value at the moment).
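If I read the suggestion correctly, the cleaner examines a fixed number of
nodes per step and then yields, so a smaller increment means less work between
opportunities to answer queries. A toy model of that idea follows. This is
not BIND's code: `struct node`, `struct cleaner`, `cleaner_step` and
`CLEANER_INCREMENT` are hypothetical names of my own, and the real cleaner
walks a tree rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

#define CLEANER_INCREMENT 200   /* nodes examined per step (toy value) */

struct node {
    long expires;   /* absolute expiry time, in seconds */
    int  live;      /* nonzero while the node still holds data */
};

struct cleaner {
    struct node *nodes;
    size_t       count;
    size_t       cursor;    /* where the next step resumes */
};

/*
 * Examine at most `increment` nodes starting at the cursor and drop the
 * expired ones, adding the number dropped to *purged.  Returns 1 when the
 * pass over all nodes is complete (the cursor is then reset), else 0.
 * The caller is expected to service queries between steps.
 */
int cleaner_step(struct cleaner *cl, long now, size_t increment,
                 size_t *purged)
{
    assert(increment > 0);

    size_t end = cl->cursor + increment;
    if (end > cl->count)
        end = cl->count;

    for (; cl->cursor < end; cl->cursor++) {
        struct node *n = &cl->nodes[cl->cursor];
        if (n->live && n->expires <= now) {
            n->live = 0;
            (*purged)++;
        }
    }

    if (cl->cursor == cl->count) {
        cl->cursor = 0;
        return 1;
    }
    return 0;
}
```

In this model the total work per pass is unchanged; lowering the increment
from 1000 to 200 only splits it into five times as many, shorter steps.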
I'll look into it, but the cleaning method itself seems intrinsically flawed.
As I understand a cache, once new data comes into a full cache, the old data
should get 'pushed' out, instead of being removed by a periodic (garbage
collection?) cleanup. If I'm misunderstanding this, feel free to correct me.
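
To make the 'pushed out' idea concrete, here is a minimal sketch of evicting
on insert, using least-recently-used as the victim policy. It is only an
illustration of what I have in mind, not how named works: every name in it
(`struct cache`, `cache_insert`, `cache_lookup`, `CACHE_CAPACITY`) is
hypothetical, the capacity is tiny, and it ignores TTLs entirely.

```c
#include <assert.h>
#include <string.h>

#define CACHE_CAPACITY 4    /* tiny on purpose, so eviction is easy to see */
#define NAME_MAX_LEN   64

struct entry {
    char          name[NAME_MAX_LEN];  /* lookup key, e.g. a domain name */
    int           value;               /* stand-in for the cached record */
    unsigned long last_used;           /* logical time of the last access */
    int           in_use;
};

struct cache {
    struct entry  slots[CACHE_CAPACITY];
    unsigned long clock;               /* bumped on every access */
};

void cache_init(struct cache *c)
{
    memset(c, 0, sizeof(*c));
}

/* Return a pointer to the cached value, or NULL on a miss. */
int *cache_lookup(struct cache *c, const char *name)
{
    for (int i = 0; i < CACHE_CAPACITY; i++) {
        struct entry *e = &c->slots[i];
        if (e->in_use && strcmp(e->name, name) == 0) {
            e->last_used = ++c->clock;   /* a hit refreshes the entry */
            return &e->value;
        }
    }
    return NULL;
}

/*
 * Insert or update an entry.  Preference order for the slot to write:
 * an existing entry with the same name, then a free slot, then the
 * least recently used entry, which is overwritten (evicted).
 */
void cache_insert(struct cache *c, const char *name, int value)
{
    assert(strlen(name) < NAME_MAX_LEN);

    struct entry *target = NULL;
    for (int i = 0; i < CACHE_CAPACITY; i++) {
        struct entry *e = &c->slots[i];
        if (e->in_use && strcmp(e->name, name) == 0) {
            target = e;
            break;
        }
        if (target == NULL) {
            target = e;
            continue;
        }
        if (!target->in_use)
            continue;                    /* already holding a free slot */
        if (!e->in_use || e->last_used < target->last_used)
            target = e;
    }

    strcpy(target->name, name);
    target->value = value;
    target->last_used = ++c->clock;
    target->in_use = 1;
}
```

With this layout the cost of staying within the size limit is paid one entry
at a time on insert, so there is no separate cleaning pass to stall on. The
linear scan is only for brevity; a real cache would pair a hash table with a
linked list to find the victim in constant time.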

>
>> Operating system choices aside, is there any reason to expect that moving to a
>> threaded nameserver would help overcome this issue?
>
>Yes, at least in theory, because with multiple threads (on multiple
>processors) additional threads can continue accepting (and responding
>to) queries while the other thread is cleaning the cache.  However,
>I'd still not recommend enabling threads for the combination of
>FreeBSD and BIND9.  The overhead for the threads in this combination
>is so bad that I suspect the possible benefit does not outweigh the
>performance penalty.
>
>For other OSes (with multiple processors), it may be possible to
>mitigate this problem with multiple threads.
>
>					JINMEI, Tatuya
>					Communication Platform Lab.
>					Corporate R&D Center, Toshiba Corp.
>					jinmei at isl.rdc.toshiba.co.jp
Linux is a possibility, but I'd hesitate to make such a drastic change for a
problem that's going to be the same. Multiple threads or not, the cache-cleaning
method just seems inefficient.