recursive queries fail with high load?

Sotiris Tsimbonis tsimbonis at forthnet.gr
Mon Feb 26 15:56:55 UTC 2007


Chris Michels wrote:
> Sotiris Tsimbonis wrote:
>> Unfortunately, we seem to face the same problem with BIND 9.3.3. After 
>> 2-3 days of uptime, for no apparent reason, all answers take too long 
>> and usually time out.
>>   
> You say all answers take too long and usually time out.  Does that
> include queries for which you are authoritative?  For us only recursive
> queries are taking a long time.

This server does not have any real zones for authoritative answers (20 
zones shown in rndc status are in internal view only). All queries 
received and answered are recursive (restricted with allow-query and 
allow-recursion options to our customers only).

>> # rndc status
>> recursive clients: 25/10000
>>   
> Our recursive client count is much higher, which also makes me think
> this may not be the same problem.

The above rndc status was taken at the time of the problem, which is why 
the count is so low.
During 'normal' operation we can get 500 recursive clients or more, 
but we now avoid sending that much traffic to this name server until we 
fix the problem.
At the moment we get

# rndc status
recursive clients: 238/10000
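
To track this figure over time rather than reading it by eye, the line can be parsed out of the rndc status text. This is only a minimal sketch, assuming the two-field "current/limit" format shown above; the function name and the sample string are illustrative and not part of BIND.

```python
import re


def parse_recursive_clients(status_text):
    """Return (current, limit) from the 'recursive clients' line, or None.

    Assumes the two-field 'current/limit' form quoted in this thread.
    """
    match = re.search(
        r"^recursive clients:\s*(\d+)/(\d+)", status_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample line copied from the output quoted above.
sample = "recursive clients: 238/10000\n"
current, limit = parse_recursive_clients(sample)
print(f"{current} of {limit} recursive client slots in use")
```

A count that stays far below the limit while queries still time out, as in the 25/10000 reading, would suggest the limit itself is not what is being hit.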

> Note that when it is really bad, increasing the timeout of the query
> doesn't help and dig gets a SERVFAIL response.  Sometimes we get
> SERVFAIL even with a short timeout like this:

Ditto.
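
To tell a SERVFAIL apart from a plain timeout when testing repeatedly, the status and query time can be pulled out of saved dig output. This is a rough sketch, assuming dig's usual ";; ->>HEADER<<-" and ";; Query time:" lines; the function name and the sample text are made up for illustration and are not captured from the affected server.

```python
import re


def summarize_dig_output(dig_text):
    """Return (status, query_time_ms) from dig output.

    Either element is None if the corresponding line is absent,
    for example when dig gave up without receiving any response.
    """
    status = re.search(r"status:\s*([A-Z]+)", dig_text)
    qtime = re.search(r"Query time:\s*(\d+)\s*msec", dig_text)
    return (
        status.group(1) if status else None,
        int(qtime.group(1)) if qtime else None,
    )


# Illustrative text only, shaped like a dig response header and footer.
sample = (
    ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1234\n"
    ";; Query time: 1503 msec\n"
)
print(summarize_dig_output(sample))
```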

Sotiris.



More information about the bind-users mailing list