bind-9.5.0b1 problem on ppc64 : rbtdb.c:1532: REQUIRE(prev > 0) failed
Fajar A. Nugraha
fajar at fajar.net
Tue Feb 5 02:10:41 UTC 2008
Hi,
I'm running BIND on RHEL5/ppc, on a somewhat busy nameserver. I'm building
my own RPM based on the Fedora 9 (development) SRPM, bind-9.5.0-24.b1.
First I built a 32-bit RPM (ppc), which works fine on an older ppc machine
(power4, CPU, 1GHz). That machine doesn't have the processing power we
need. To give an idea of the workload, "rndc status" usually reports
over 2000 recursive clients. So we moved to newer hardware
(power5+, 8 logical CPUs, 1.6 GHz). This is when the problem began.
With the same 32-bit binary, the named process would sometimes freeze for
no apparent reason: "ps" says it's running, but named doesn't respond to
any queries. Using a 64-bit binary helps a little, since instead of
freezing, it now stops with this error:
rbtdb.c:1532: REQUIRE(prev > 0) failed
exiting (due to assertion failure)
Digging into the source code, the corresponding code seems to be:
./lib/dns/rbtdb.c
dns_rbtnode_refdecrement(node, &nrefs);
which leads to ./lib/dns/include/dns/rbt.h
isc_refcount_decrement(&(node)->references, (refs)); \
which leads to ./lib/isc/include/isc/refcount.h
prev = isc_atomic_xadd(&(rp)->refs, -1); \
REQUIRE(prev > 0); \
which leads to ./lib/isc/powerpc/include/isc/atomic.h, that contains
ppc-specific asm. Way out of my league :-P
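To make the chain above easier to follow, here is a minimal sketch of the same decrement-and-check pattern in portable C11. It is not the BIND code: demo_refcount_t and the demo_refcount_* functions are hypothetical names, and it assumes that isc_atomic_xadd returns the value held before the addition, which is what the REQUIRE(prev > 0) check suggests.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a reference counter; not BIND's real type. */
typedef struct {
    atomic_int_fast32_t refs;
} demo_refcount_t;

/* Set the counter to its starting value (not itself an atomic operation). */
static void demo_refcount_init(demo_refcount_t *rp, int_fast32_t n) {
    atomic_init(&rp->refs, n);
}

/* Atomically add 1 and return the new count. */
static int_fast32_t demo_refcount_increment(demo_refcount_t *rp) {
    return atomic_fetch_add(&rp->refs, 1) + 1;
}

/*
 * Atomically subtract 1 and return the new count.
 * atomic_fetch_sub returns the value held *before* the subtraction,
 * which plays the role of "prev" in the macro quoted above.
 */
static int_fast32_t demo_refcount_decrement(demo_refcount_t *rp) {
    int_fast32_t prev = atomic_fetch_sub(&rp->refs, 1);

    /* Counterpart of REQUIRE(prev > 0): the count must not already be
     * zero (or negative) when a reference is dropped. */
    assert(prev > 0);
    return prev - 1;
}
```

Read this way, the failed assertion would mean that some decrement observed a count that was already zero or less. That could happen if a reference was released twice, or if an update to the counter was lost because the add was not really atomic across CPUs. I can't tell which from the source alone.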
So here are my questions:
- Why does it work correctly on my older machine, but not on this newer
machine?
- Is the new machine perhaps too fast, or does it have too many logical CPUs?
- Why do the ppc and ppc64 binaries behave differently? (I like the
ppc64 behavior better though)
Regards,
Fajar
More information about the bind-users
mailing list