bind-9.5.0b1 problem on ppc64 : rbtdb.c:1532: REQUIRE(prev > 0) failed

Fajar A. Nugraha fajar at fajar.net
Tue Feb 5 02:10:41 UTC 2008


Hi,
I'm running bind on RHEL5/ppc, on a somewhat busy nameserver. I built 
my own RPM based on the Fedora 9 (dev) SRPM, bind-9.5.0-24.b1.

First I built a 32-bit RPM (ppc), which works fine on an older ppc machine 
(power4, CPU, 1GHz). That machine doesn't have the processing power 
required for our needs. To give an idea of the workload, "rndc status" 
usually reports over 2000 "recursive clients". So we moved to newer hardware 
(power5+, 8 logical CPUs, 1.6 GHz). This is when the problem began.

Using the same 32-bit binary, the named process would sometimes freeze for no 
apparent reason. "ps" says it's running, but named doesn't respond to 
any queries. Using a 64-bit binary helps a little: instead of 
freezing, it now stops with this error:

rbtdb.c:1532: REQUIRE(prev > 0) failed
exiting (due to assertion failure)

Digging into the source code, the corresponding code seems to be:
./lib/dns/rbtdb.c
    dns_rbtnode_refdecrement(node, &nrefs);

which leads to ./lib/dns/include/dns/rbt.h
                isc_refcount_decrement(&(node)->references, (refs)); \

which leads to ./lib/isc/include/isc/refcount.h
                prev = isc_atomic_xadd(&(rp)->refs, -1);        \
                REQUIRE(prev > 0);                              \

which leads to ./lib/isc/powerpc/include/isc/atomic.h, which contains 
ppc-specific asm. Way out of my league :-P
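
For anyone following the chain above, here is a minimal sketch of what that check amounts to. It uses C11 <stdatomic.h> in place of the ppc-specific asm. The names demo_refcount, demo_refcount_init, demo_refcount_increment, demo_refcount_decrement and demo_balanced_use are made up for illustration and are not BIND's. The only thing taken from the source is the pattern: atomically add -1, keep the previous value, and require that it was greater than zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the reference counter; not BIND's definition. */
typedef struct {
    atomic_int refs;
} demo_refcount;

static void demo_refcount_init(demo_refcount *rc, int initial) {
    atomic_init(&rc->refs, initial);
}

/* Take one reference; returns the value held before the increment. */
static int demo_refcount_increment(demo_refcount *rc) {
    return atomic_fetch_add(&rc->refs, 1);
}

/*
 * Drop one reference. atomic_fetch_add returns the value held *before*
 * the addition, which is the role "prev" plays in the refcount.h excerpt.
 * The assert mirrors REQUIRE(prev > 0): it fires when the counter was
 * already zero or negative when the decrement arrived.
 * Returns the number of references remaining.
 */
static int demo_refcount_decrement(demo_refcount *rc) {
    int prev = atomic_fetch_add(&rc->refs, -1);
    assert(prev > 0);
    return prev - 1;
}

/* Balanced use: two references taken, two released; returns 0. */
static int demo_balanced_use(void) {
    demo_refcount rc;
    demo_refcount_init(&rc, 1);          /* first holder */
    demo_refcount_increment(&rc);        /* second holder */
    demo_refcount_decrement(&rc);        /* one reference left */
    return demo_refcount_decrement(&rc); /* none left */
}
```

A third demo_refcount_decrement call in demo_balanced_use would trip the assert. That is the condition the error message reports: the counter was already at zero or below when a decrement arrived.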

So here are my questions:
- Why did it work correctly on my older machine, but not on this newer 
machine?
- Is the new machine perhaps too fast, or does it have too many logical CPUs?
- Why do the ppc and ppc64 binaries behave differently? (I like the 
ppc64 behavior better, though.)

Regards,

Fajar



