BIND 9.4.3b2 crash

Mark Andrews Mark_Andrews at isc.org
Tue Jul 22 23:28:35 UTC 2008


> On Wed, 2008-07-23 at 00:26 +1000, Mark Andrews wrote:
> > > Hi,
> > > 
> > > I have several authoritative and recursive combined servers running bind
> > > 9.4.3b2 that all crash in the same manner.  They will normally run for a
> > > day or two between crashes, however.  The syslog messages only show:
> > > 
> > > Jul 21 12:08:32 named[17892]: socket.c:1710: fatal error:
> > > Jul 21 12:08:32 named[17892]:
> > > RUNTIME_CHECK(((pthread_mutex_destroy(((&sock->lock))) == 0) ? 0 : 34)
> > > == 0) failed
> > > Jul 21 12:08:32 named[17892]: exiting (due to fatal error in library)
> > > 
> > > but elsewhere in the log there are these messages (which don't cause
> > > named to exit):
> > > 
> > > Jul 21 10:53:39 named[17892]: socket.c:525: unexpected error:
> > > Jul 21 10:53:39 named[17892]: epoll_ctl(DEL), 73: Bad file descriptor
> > > Jul 21 10:53:39 named[17892]: socket.c:525: unexpected error:
> > > Jul 21 10:53:39 named[17892]: epoll_ctl(DEL), 69: Bad file descriptor
> > > 
> > > 9.4.2 was quite a bit more stable (although I won't say it never exited,
> > > different errors though IIRC) on the same hardware / OS patch level.
> > 
> > 	The b2 in BIND 9.4.3b2 stands for BETA 2.  It's not a final
> > 	release and has lots of new code in it to support many more
> > 	concurrent file descriptors as well as getting back some
> > 	of the performance lost in BIND 9.4.2-P1 compared with
> > 	BIND 9.4.2.  If we had our druthers, we would have only
> > 	introduced a change like this in a .0 release.
> > 
> > 	The changes for BIND 9.4.2-P1 are much more conservative
> > 	than those for BIND 9.4.3b2.
> 	
> Thanks, I just didn't know if 9.4.2-P1 would handle the load after
> reading the warnings on higher CPU usage. Some of our traffic spikes
> push the loadavg past 2.00 even with 9.4.3b2.  Also I wanted to do my
> part in getting beta2 tested.  
> 
> 
> > > Both servers average 50-100 recursive clients, although I've seen spikes
> > > to 2000+ (which I imagine is malicious traffic or bots trying to exploit
> > > the recent vulnerability that I upgraded to fix).
> > > 
> > > $ uname -a
> > > Linux 2.6.17-ARCH #1 SMP PREEMPT Sun Jul 16 09:29:38 CEST 2006 i686
> > > Intel(R) Pentium(R) D CPU 3.00GHz GenuineIntel GNU/Linux
> > > 
> > > $ gcc -v
> > > Using built-in specs.
> > > Target: i686-pc-linux-gnu
> > > Configured with: ../gcc-4.1.1/configure --prefix=/usr --enable-shared
> > > --enable-languages=c,c++,objc --enable-threads=posix
> > > --enable-__cxa_atexit --disable-multilib --libdir=/usr/lib
> > > --enable-clocale=gnu
> > > Thread model: posix
> > > gcc version 4.1.1
> > > 
> > > Anything I can try to stop these crashes?
> > 
> > 	Stack backtraces are useful to isolate the problem code.
> > 	Backing down to BIND 9.4.2-P1 is an option if the performance
> > 	hit is not too big.
> 
> 
> Is there a FAQ or other documentation on how these should be obtained?

	gdb /usr/local/sbin/named named.core
	thread apply all bt full
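
	The two commands above can also be run non-interactively so the
	backtrace lands in a file you can attach to a report.  What follows
	is only a minimal sketch: the helper name collect_backtrace and the
	output file name are hypothetical, and it assumes core dumps were
	already enabled (for example with "ulimit -c unlimited") in the
	environment that started named, and that gdb is installed.

```shell
# Hypothetical helper (not part of BIND): write a full backtrace of
# every thread in a core file to an output file.
# Usage: collect_backtrace <named-binary> <core-file> <output-file>
collect_backtrace() {
    bin=$1
    core=$2
    out=$3

    # Without a core file there is nothing for gdb to inspect.
    if [ ! -f "$core" ]; then
        echo "no core file at $core" >&2
        return 1
    fi

    # -batch makes gdb exit once the commands have run;
    # -ex runs a single gdb command, here the same one shown above.
    gdb -batch -ex "thread apply all bt full" "$bin" "$core" > "$out" 2>&1
}
```

	For example, with the paths used above:

	collect_backtrace /usr/local/sbin/named named.core named-bt.txt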
 
> Thanks,
> 
> Dale
> 
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark_Andrews at isc.org

