BIND 9.6.1-P1 crashing
Dario Miculinic
dario.miculinic at t-com.hr
Tue Jan 5 07:24:16 UTC 2010
I dont't have the same core dump, but this is from one that happend yesterday:
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
752 ttl_sooner(void *v1, void *v2) {
(gdb) where
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
#1 0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
#2 0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0, rdataset=0x59375628) at rbtdb.c:1273
#3 0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0, top=0x4af6f3e0) at rbtdb.c:1331
#4 0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368, least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, pruning=isc_boolean_false) at rbtdb.c:1348
#5 0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
#6 0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
#7 0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at rdataset.c:101
#8 0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
#9 0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at resolver.c:3246
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
#12 0x008a1cfe in clone () from /lib/libc.so.6
This is the output of "thread apply all bt full" command (it's quite long):
(gdb) thread apply all bt full
Thread 11 (process 11988):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x007f9367 in sigsuspend () from /lib/libc.so.6
No symbol table info available.
#2 0x081bcc74 in isc_app_run () at app.c:534
event = (isc_event_t *) 0x0
next_event = <value optimized out>
task = (isc_task_t *) 0x0
sset = {__val = {0 <repeats 32 times>}}
strbuf =
"č\220a\b\000\020\005\000\000đ˙˙\000\000\003\000\030\021ńˇy\000\000\000\002\000\000\000ü\023ňˇ\000\000\000\000\000\000\003\000P \032\b at qđˇž\000\000\000\030\000\000\000ô\037\221\000 at 1\221\000Pqđˇ\bR˝ż\207˝\203\000\021\000\000\000\024\000\000\000`\000\000\000Đěa\by.\000\000Pqđˇ8R˝ż\bŻ\031\bČs\001\000\b_ňˇÔ.\000\000č\220a\b\024\000\000"
#3 0x08059f7c in main (argc=0, argv=0xbfbd53c4) at ./main.c:932
result = <value optimized out>
Thread 10 (process 11989):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
---Type <return> to continue, or q <return> to quit---
No symbol table info available.
Thread 9 (process 11990):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 8 (process 11991):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 7 (process 11992):
#0 0x00fe4410 in __kernel_vsyscall ()
---Type <return> to continue, or q <return> to quit---
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 6 (process 11993):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 5 (process 11994):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081b90ae in run (uap=0xb7f09008) at task.c:810
---Type <return> to continue, or q <return> to quit---
No locals.
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 4 (process 11996):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x00953e38 in sendmsg () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081c5962 in doio_send (sock=0xae734710, dev=0xae035788) at socket.c:1630
cc = -1282421656
iov = {{iov_base = 0xafa08cd4, iov_len = 48}, {iov_base = 0xb7f21030, iov_len = 236}, {iov_base = 0x3c0,
iov_len = 2968577464}, {iov_base = 0xb0f0e5c0, iov_len = 2968577472}, {iov_base = 0x94ea47, iov_len = 0}, {
iov_base = 0x861a0b0, iov_len = 2927195036}, {iov_base = 0x2edc, iov_len = 140611816}, {iov_base = 0xec,
iov_len = 0}}
write_count = 48
msghdr = {msg_name = 0x0, msg_namelen = 0, msg_iov = 0xb38fcc68, msg_iovlen = 1, msg_control = 0x0,
msg_controllen = 0, msg_flags = 0}
addrbuf = "Ě\217ł[Ź\032\bü\220a\b\020GsŽxÍ\217łňB\034\b\030GsŽ4\000\000\000ťś\037\bě\000\000\000
\214\023\bč\220a\b\bÍ\217ł\021Ř\031\bč\220a\bě\000\000"
attempts = 0
send_errno = 0
strbuf =
"ŘR\"Ž\037y\n\bŕBôˇ\000\000\000\0000\000\000\000P \032\b8Şŕ\202ž\000\000\0000\000\000\000\233\207\n\b@Ě\217ł\020GsŽ¸Ě\217łţV\033\b@Ě\217ł0\000\000\000Ŕ\000\000\000÷\n\b\000\000\000\000¨$YŽGę\224\000\000\000\000\000°\224a\bXUCüđ\000\000\000P \032\b\210W\003Žž\000\000\000đ\000\000\000\b\020ňˇ0\020ňˇ°\000\000"
#3 0x081c5f41 in socket_send (sock=0xae734710, dev=0xae035788, task=0xb0f0e5b8, address=0xae79739c, pktinfo=0x0,
---Type <return> to continue, or q <return> to quit---
flags=0) at socket.c:4291
io_state = <value optimized out>
have_lock = isc_boolean_false
ntask = (isc_task_t *) 0x0
result = <value optimized out>
#4 0x08136ae5 in resquery_send (query=0xafa08c60) at resolver.c:1921
fctx = (fetchctx_t *) 0xad698268
result = <value optimized out>
qname = (dns_name_t *) 0x0
qrdataset = (dns_rdataset_t *) 0x0
r = {base = 0xafa08cd4 "\2076", length = 48}
res = (dns_resolver_t *) 0xb0f0d008
task = (isc_task_t *) 0xb0f0e5b8
socket = (isc_socket_t *) 0xae734710
tcpbuffer = {magic = 0, base = 0x274, length = 2528, used = 135051914, current = 2909287056,
active = 2909373308, link = {prev = 0x94ea47, next = 0x0}, mctx = 0x88700f8}
address = (isc_sockaddr_t *) 0xae79739c
buffer = (isc_buffer_t *) 0xafa08c98
ipaddr = {family = 2, type = {in = {s_addr = 4232271192}, in6 = {in6_u = {
u6_addr8 = "XUCü?\000\000\000\b°đˇxĐ\217ł", u6_addr16 = {21848, 64579, 63, 0, 45064, 47088, 53368, 45967},
u6_addr32 = {4232271192, 63, 3086004232, 3012546680}}},
un =
"XUCü?\000\000\000\b°đˇxĐ\217ł)ĺ\031\b\bĆrŽh\203v\000\000\000\000˙˙˙˙\220Đ\217ł\bĆrŽř\001\000\000źĐ\217ł\000\000\000\000\bĆrŽŘĐ\217ł8Ń\217ł(Ń\217ł\000\000\000\0008Ń\217łĽÖ\034\b(Ń\217ł\000\000\000\000\001\000\000\000\004Ń\217ł8ĆrŽ\001\000\000\001"},
zone = 0}
tsigkey = (dns_tsigkey_t *) 0x0
peer = (dns_peer_t *) 0x0
useedns = 2909373032
cctx = {magic = 0, allowed = 0, edns = -1, table = {0x0 <repeats 64 times>}, initialnodes = {{r = {
---Type <return> to continue, or q <return> to quit---
base = 0xafa03320 "\003www\vcorrupttube\003com", length = 21}, offset = 12, count = 0, labels = 4 '\004',
next = 0x0}, {r = {base = 0xafa03324 "\vcorrupttube\003com", length = 17}, offset = 16, count = 1,
labels = 3 '\003', next = 0x0}, {r = {base = 0xafa03330 "\003com", length = 5}, offset = 28, count = 2,
labels = 2 '\002', next = 0x0}, {r = {base = 0x0, length = 0}, offset = 0, count = 0, labels = 0 '\0',
next = 0xb38fd044}, {r = {base = 0xad616b08 "NbdanSNDř\220a\023", length = 23}, offset = 53688,
count = 45967, labels = 244 'ô', next = 0xb38fd168}, {r = {base = 0xad616b0c "nSNDř\220a\023", length = 28},
offset = 65433, count = 19265, labels = 1 '\001', next = 0x1}, {r = {base = 0x0, length = 0}, offset = 53144,
count = 45967, labels = 104 'h', next = 0x0}, {r = {base = 0x4b41ff99 "ŰAK\200ŻFy(ç\020\035ćź\n",
length = 0}, offset = 27404, count = 44385, labels = 88 'X', next = 0x0}, {r = {
base = 0xa30f328 "\220\004", length = 9515960}, offset = 21358, count = 17486, labels = 104 'h',
next = 0x1}, {r = {base = 0x1 <Address 0x1 out of bounds>, length = 1}, offset = 53188, count = 45967,
labels = 68 'D', next = 0xffffffff}, {r = {base = 0xffffffff <Address 0xffffffff out of bounds>, length = 0},
offset = 0, count = 0, labels = 0 '\0', next = 0x0}, {r = {base = 0x0, length = 4104}, offset = 53416,
count = 45967, labels = 8 '\b', next = 0xb7f0c008}, {r = {base = 0xb38fd0a8 "", length = 3012546744},
offset = 8, count = 0, labels = 8 '\b', next = 0xb38fd0b8}, {r = {
base = 0x9538a4 "X=\001đ˙˙s\001Ăč\204˘˙˙\201ÁB\207", length = 8}, offset = 36582, count = 131,
labels = 7 '\a', next = 0xb38fd0a4}, {r = {base = 0x8 <Address 0x8 out of bounds>, length = 135914793},
offset = 50680, count = 2783, labels = 236 'ě', next = 0x3c0}, {r = {base = 0xb38fd0d8 "ŕ\t",
length = 2910225256}, offset = 50696, count = 44658, labels = 72 'H', next = 0x81bb002}}, count = 3,
mctx = 0x86190e8}
secure_domain = isc_boolean_false
#5 0x08137579 in fctx_query (fctx=0xad698268, addrinfo=0xae797398, options=0) at resolver.c:1505
res = (dns_resolver_t *) 0xb0f0d008
task = (isc_task_t *) 0xb0f0e5b8
result = 2927167124
query = (resquery_t *) 0x30
addr = {type = {sa = {sa_family = 61584, sa_data = "5\000i\003\000\000\220\006yŽ\bka"}, sin = {
sin_family = 61584, sin_port = 53, sin_addr = {s_addr = 873}, sin_zero = "\220\006yŽ\bka"}, sin6 = {
---Type <return> to continue, or q <return> to quit---
sin6_family = 61584, sin6_port = 53, sin6_flowinfo = 873, sin6_addr = {in6_u = {
u6_addr8 = "\220\006yŽ\bkaHŇ\217ł\000\000\000", u6_addr16 = {1680, 44665, 27400, 44385, 53832, 45967, 0,
0}, u6_addr32 = {2927167120, 2908842760, 3012547144, 0}}}, sin6_scope_id = 143405224}, sunix = {
sun_family = 61584,
sun_path =
"5\000i\003\000\000\220\006yŽ\bkaHŇ\217ł\000\000\000\000¨0\214\b\231˙AKÂ\005\000\000-o\020\büŇ\217ł\000\000\000\0008Ó\217ł!É\177\000\200
\221\000Ŕ
\221\000ô\037\221\000|\204i\000\000\000\000\fŇ\217łÖĹ\177\000\b#\221\000\004Ň\217łë?\017\017ô\037\221\000\030Ň\217ł\024Ě\177\000\n\r"}},
length = 3012547128, link = {prev = 0x81af974,
next = 0xb38fd2fc}}
have_addr = isc_boolean_false
srtt = <value optimized out>
#6 0x08137784 in fctx_try (fctx=0xad698268, retrying=isc_boolean_false) at resolver.c:3013
result = <value optimized out>
addrinfo = (dns_adbaddrinfo_t *) 0x0
#7 0x081b9221 in run (uap=0xb7f09008) at task.c:862
No locals.
#8 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#9 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 3 (process 11997):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x00950d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2 0x081cda66 in isc_condition_waituntil (c=0xb7f0a040, m=0xb7f0a010, t=0xb7f0a038) at condition.c:59
presult = <value optimized out>
result = 0
---Type <return> to continue, or q <return> to quit---
ts = {tv_sec = 1262616473, tv_nsec = 971230000}
strbuf =
"4\000\000\000ŕ\000\000\000ůë\031\bôż\225\000\030Ăď˛Gę\224\000\000\000\000\0000\225a\b\000\000\000\000Ý.\000\000č\220a\b(J.Ž\210äăŽ8Ăď˛\002°\033\bŘäăŽxJ.ŽHĂď˛\002°\033\bÉ\001\000\000\b°đˇxĂď˛Lć\031\b\210äăŽ(J.ŽxĂď˛\bç\031\b\210äăŽHť\005Žťś\037\b\210äăŽ\222\003\000"
#3 0x081bb583 in run (uap=0xb7f0a008) at timer.c:722
now = {seconds = 1262616473, nanoseconds = 955861000}
result = <value optimized out>
#4 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 2 (process 11998):
#0 0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x008a2376 in epoll_wait () from /lib/libc.so.6
No symbol table info available.
#2 0x081ca494 in watcher (uap=0xb7f0c008) at socket.c:3468
manager = <value optimized out>
cc = 1
strbuf = '\0' <repeats 127 times>
#3 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
Thread 1 (process 11995):
#0 0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
---Type <return> to continue, or q <return> to quit---
No locals.
#1 0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
elt = (void *) 0x0
less = <value optimized out>
#2 0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0, rdataset=0x59375628) at rbtdb.c:1273
size = <value optimized out>
idx = 3
#3 0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0, top=0x4af6f3e0) at rbtdb.c:1331
d = (rdatasetheader_t *) 0x4b726d
down_next = (rdatasetheader_t *) 0x0
#4 0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368, least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, pruning=isc_boolean_false) at rbtdb.c:1348
result = <value optimized out>
write_locked = <value optimized out>
nodelock = (rbtdb_nodelock_t *) 0x88099a4
no_reference = <value optimized out>
#5 0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
rbtdb = (dns_rbtdb_t *) 0x0
node = (dns_rbtnode_t *) 0x411b1368
inactive = <value optimized out>
nodelock = (rbtdb_nodelock_t *) 0x88099a4
#6 0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
db = (dns_db_t *) 0xb0f4f008
node = (dns_dbnode_t *) 0x411b1368
#7 0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at rdataset.c:101
No locals.
#8 0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
res = (dns_resolver_t *) 0xb0f0d008
---Type <return> to continue, or q <return> to quit---
bucketnum = <value optimized out>
sa = (isc_sockaddr_t *) 0x0
next_sa = (isc_sockaddr_t *) 0x0
#9 0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at resolver.c:3246
fctx = (fetchctx_t *) 0xb05c5988
bucket_empty = <value optimized out>
res = (dns_resolver_t *) 0xb0f0d008
bucketnum = 28
validator = <value optimized out>
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
No locals.
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#12 0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.
JINMEI Tatuya / 神明達哉 wrote:
> At Wed, 30 Dec 2009 10:23:17 +0100,
> Dario Miculinic <dario.miculinic at t-com.hr> wrote:
>
>> I'm administrating 4 DNS servers running CentOS release 5.4 and Red Hat Enterprise Linux Server release 5.2. with BIND
>> version 9.6.1-P1. On 3 of them BIND crashed 7 times in last 10 days. There's nothing in log files, but we have core dump
>> file. I found this in the core dump:
>>
>> #0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
>> 752 ttl_sooner(void *v1, void *v2) {
>> (gdb) where
>> #0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
>
> What's the result of the following gdb command?
>
> (gdb) thread apply all bt full
>
> We've seen crash like this one, but we've not figured out how this
> happens. This is pretty likely an inter-thread race, and it may be
> tricky. According to the v1/v2 values in your stack trace, a full
> backtrace with information of other threads may provide more useful
> hint.
>
> If you need immediate workaround rather than chasing the bug,
> rebuilding named with --disable-atomic may help (we cannot be sure
> because we don't yet know how this bug happens in the first place).
> This will use locks in a more conservative way and may avoid the
> tricky race condition at the cost of lower performance (so if you want
> to try that you'll also need to watch the server load).
>
> ---
> JINMEI, Tatuya
> Internet Systems Consortium, Inc.
>
More information about the bind-users
mailing list