Recursion ceases for 5-10 minutes at random intervals throughout the day
JINMEI Tatuya / 神明達哉
Jinmei_Tatuya at isc.org
Fri Feb 15 04:00:39 UTC 2008
At Wed, 13 Feb 2008 17:32:41 -0500,
Bill Springall <springall at fuse.net> wrote:
> Each server handles between 500 and 1500 qps throughout the
> day, under normal load. Problem occurs at all loads.
> I've tried port monitoring, tcpdumping the traffic, and sifting
> through the requests and nothing seems out of the ordinary. Numerous
> tweaks of the OS have not helped (state table within limits and then
> disabled, firewall deactivated/activated, eth stats good). When the
> problem happens I can get onto the machine and it is ok (network
> upstream good, routing table hasn't inherited anything new, server calm).
> When I turn logging up to a level that can help, named can't keep up.
> We now have a troubleshooting process in the works that
> involves different hardware and 9.4.2, environment re-architecture, as
> well as, <shiver>, other caching DNS software.
> Is there a known problem, that I haven't been able to find, that
> could be causing this? As I understand it, the "Server Failure" message
> is a general message; could someone help to point me to the next thing
> to try? Any help would be appreciated!
I cannot think of a reason offhand, but let me ask a few questions first.
- according to your description, the queries were not dropped, but
were simply answered with a server failure (SERVFAIL) response, right?
- how much memory does named use when this occurs?
- how busy (in terms of CPU utilization) is named when this occurs?
- does this change if you disable threads?
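
To help answer the first question, here is a minimal sketch of a probe
that tells a dropped query (no reply before a timeout) apart from one
answered with SERVFAIL. It is only an illustration, not something from
the original report: the function names, the 2-second timeout, and the
idea of polling a fixed name are all assumptions, and it uses nothing
beyond the Python standard library.

```python
import random
import socket
import struct

# RCODE values from the DNS header (RFC 1035, section 4.1.1).
RCODE_NAMES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
               3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name, qtype=1, query_id=None):
    """Build a minimal recursive query (RD bit set) for `name`, QTYPE A by default."""
    if query_id is None:
        query_id = random.randint(0, 0xFFFF)
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


def parse_rcode(response):
    """Return (query_id, rcode) taken from the 12-byte DNS header."""
    if len(response) < 12:
        raise ValueError("short DNS message")
    query_id, flags = struct.unpack("!HH", response[:4])
    return query_id, flags & 0x000F  # RCODE is the low 4 bits of the flags


def probe(server, name, timeout=2.0, port=53):
    """Return 'DROPPED' if no reply arrives in time, else the RCODE name."""
    query = build_query(name)
    sent_id = struct.unpack("!H", query[:2])[0]
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, port))
        try:
            while True:
                data, _ = sock.recvfrom(4096)
                reply_id, rcode = parse_rcode(data)
                if reply_id == sent_id:  # ignore unrelated datagrams
                    return RCODE_NAMES.get(rcode, "RCODE%d" % rcode)
        except socket.timeout:
            return "DROPPED"
```

Calling probe() once a second against the caching server (for example
probe("192.0.2.1", "example.com"), where the address is a placeholder)
and logging the result with a timestamp should show whether the 5-10
minute windows consist of timeouts or of SERVFAIL replies.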
Thanks,
---
JINMEI, Tatuya
Internet Systems Consortium, Inc.