2 simultaneous hung Bind boxes

Nikkilä, Tommi tommi.nikkila at logica.com
Wed Oct 28 07:30:33 UTC 2009


Hi!

On some of our (Linux-based) DNS servers BIND just hangs; the combination was fairly old hardware and a fairly new OS/BIND. I couldn't figure it out either until I came across https://www.isc.org/node/302.

At least you could try it; I found no harm in setting /proc/sys/net/core/xfrm_larval_drop to 1, just to be on the safe side...
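
For anyone who wants to script the change, here is a minimal sketch in Python. Only the /proc path comes from this message; the helper names read_flag and set_flag are hypothetical, and writing to the real file requires root. A value written this way is a runtime setting and does not survive a reboot, so a persistent change would need the usual sysctl configuration for your distribution.

```python
from pathlib import Path

# Path taken from the message above; writing to it requires root.
LARVAL_DROP = Path("/proc/sys/net/core/xfrm_larval_drop")


def read_flag(path=LARVAL_DROP):
    """Return the current value of the flag as an int."""
    return int(Path(path).read_text().strip())


def set_flag(value, path=LARVAL_DROP):
    """Write 0 or 1 to the flag file and return the value read back."""
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    Path(path).write_text(f"{value}\n")
    return read_flag(path)
```

Calling read_flag() first shows the current setting, and set_flag(1) applies the change described above. The path argument exists only so the helpers can be tried against an ordinary file before touching the live system.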

--
Tommi Nikkilä
System Specialist

-----Original Message-----
From: bind-users-bounces at lists.isc.org [mailto:bind-users-bounces at lists.isc.org] On Behalf Of Justin Shore
Sent: 28 October 2009 7:30
To: bind-users at lists.isc.org
Subject: 2 simultaneous hung Bind boxes

I got a call from a remote tech earlier this evening.  He was at home on our service and couldn't get on the Internet.  His IP connectivity was fine and he could reach my NOC website, but by IP address only.  DNS, however, was hosed.  About the time I was in a position to check the BIND logs and sniff his traffic, the problem went away.  We chalked it up to a local problem until, a few minutes later, I experienced the same thing from across the SP network.  My DNS requests simply timed out.  I turned on querylog on our boxes and could see what appeared to be successful hits and replies.  The boxes were just not replying to queries.  Traffic on our main upstream dropped by about 90% within a few short minutes (users' DNS stopped and outbound usage basically ground to a halt).  Not knowing what else to try, I restarted BIND on both NSs.  That fixed it.

The boxes are running fairly old BIND code, 9.5.1b2.  Tomorrow I will upgrade to 9.6.1rc1 (unless people believe 9.7.0b1 is ready for use).  My question is: are there any known ways for a crafted query or crafted reply to cause what I've described on that old release of BIND?  I recall hearing about assorted things over the past couple of years, though I thought they were things that would cause an actual crash, not the mental hosing my boxes appeared to take this evening.  Does anything else come to mind?  The views on the servers only permit recursive lookups internally from our customer prefixes.  Externally you can only get responses for things that we have authority over.  Thoughts?

Thanks
  Justin



