BIND freezing up randomly under "real" load
Ian_Veach at nshe.nevada.edu
Thu Aug 4 18:44:58 UTC 2011
I am (was) prepping to deploy BIND 9.7.3-P3 (the version that ships with
RHEL 6.1) on RHEL 6.1, sitting on top of OSPF anycast. We currently run
BIND 9.5.0-P2 (with Novell patches) on SLES 11 (also with OSPF anycast)
just fine in production, but I am running into a strange problem on the
new system that I did not encounter during testing.
The SLES systems run OK, and the RHEL install tested OK under load. After
moving into production, named runs fine with no customers (OSPF off); I can
(and have) queried it (dig @localhost) all day. When I turn on OSPF and let
the real world query it, it works fine for a couple of minutes and then
hangs completely: rndc won't connect, no logs are updated, and named will
not respond at all, but it is still running (a telnet to port 53 or 953 at
least connects, but I'm not familiar with protocol-level commands).
Stopping named fails (hangs), and I have to pkill it to be able to restart
it.
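
Since telnet only shows that the port accepts a connection, a protocol-level
check could look something like the rough sketch below. It hand-builds a
minimal DNS query in RFC 1035 wire format and sends it over UDP and TCP with
a timeout, so a hung named shows up as a timeout rather than a stuck
terminal. The function names (build_query, probe_udp, probe_tcp), the
default query name and the timeouts are all made up for illustration; this
is not something from BIND itself, and it only checks that a reply with a
matching ID comes back, not that the answer is correct.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query (RFC 1035 wire format). qtype 1 = A."""
    # Header: ID, flags (only RD set = 0x0100), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def probe_udp(server, port=53, name="localhost", timeout=2.0):
    """Return True if the server sends back a UDP reply with our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(query, (server, port))
            data, _ = s.recvfrom(4096)
        except OSError:  # includes timeouts and ICMP "port unreachable"
            return False
    return len(data) >= 12 and data[:2] == query[:2]


def _recv_exact(sock, count):
    """Read exactly `count` bytes, or return None if the peer closes early."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def probe_tcp(server, port=53, name="localhost", timeout=2.0):
    """Same check over TCP, where messages carry a 2-byte length prefix."""
    query = build_query(name)
    try:
        with socket.create_connection((server, port), timeout=timeout) as s:
            s.sendall(struct.pack("!H", len(query)) + query)
            prefix = _recv_exact(s, 2)
            if prefix is None:
                return False
            (length,) = struct.unpack("!H", prefix)
            if length < 12:
                return False
            reply_id = _recv_exact(s, 2)
    except OSError:  # connect failure or timeout while waiting for a reply
        return False
    return reply_id == query[:2]


if __name__ == "__main__":
    # 127.0.0.1 is an assumption; substitute the anycast address as needed.
    print("udp:", probe_udp("127.0.0.1"))
    print("tcp:", probe_tcp("127.0.0.1"))
```

Run in a loop from cron or a shell while OSPF is on, this would at least
timestamp the moment named stops answering on each transport.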
A trace (level 99) yielded nothing in the logs before it hung. I was
running rndc commands and I see those logged, and then nothing. The few
errors noted before that were DNS format errors from external sources.
The network guys and I note that OSPF looks fine (digging to localhost,
anyway, and that eventually fails too). lsof doesn't show anything
particularly weird. System load, memory, etc. seem fine (comparable to the
SLES systems). It just seems to give up the ghost...
Any ideas? Any additional info that would be helpful?
cheers and thanks,
________________________________________________________________________
Ian Veach NSHE System Computing Services
ian_veach at nshe.nevada.edu Senior Systems Engineer
________________________________________________________________________