Ongoing CPU usage issues...

Kelsey Cummings kgc at sonic.net
Mon Apr 25 17:09:02 UTC 2005


All three of my primary name servers went into the CPU-pegged state
overnight.  I wasn't really prepared to get detailed debugging information
from them while they were semi-broken, but I did grab an strace -c from one
of the servers while it was broken and then again after I restarted it.  I
couldn't let it run too long, and this information doesn't mean too much to
me, but maybe it'll help someone else.

While one of the threads was maxing the CPU (I really should have let this
run longer):

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.65   10.071460        2643      3811      3811 rt_sigsuspend
  4.58    0.509362          48     10517           sendmsg
  1.96    0.218140           5     47573           gettimeofday
  1.75    0.194695          32      6178           write
  0.86    0.095313           6     16498      6172 recvmsg
  0.08    0.008782           2      3812           rt_sigprocmask
  0.07    0.007632           2      3811      3811 sigreturn
  0.02    0.002156          20       106        53 utime
  0.01    0.001192          99        12           send
  0.01    0.001108          17        65           kill
  0.00    0.000047           2        24           rt_sigaction
  0.00    0.000046          15         3           accept
  0.00    0.000023           2        12           time
  0.00    0.000020           2        12           getpid
  0.00    0.000020           2         9           fcntl64
  0.00    0.000007           2         3           close
------ ----------- ----------- --------- --------- ----------------
100.00   11.110003                 92446     13847 total

real    0m45.105s
user    0m0.760s
sys     0m1.680s

After restarting bind:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 95.63   67.550079        2598     25996     25996 rt_sigsuspend
  2.22    1.571516          49     31969           sendmsg
  1.05    0.742186          26     28402           write
  0.45    0.317469           2    199020           gettimeofday
  0.33    0.232188           4     59225     27813 recvmsg
  0.08    0.055330        1581        35           fsync
  0.07    0.046812           2     25997           rt_sigprocmask
  0.07    0.046250           2     25997     25997 sigreturn
  0.04    0.029707          11      2710      1355 utime
  0.03    0.020598         108       190           send
  0.02    0.011762          21       548           kill
  0.00    0.003050          87        35           rename
  0.00    0.002125          61        35           open
  0.00    0.001430          40        36        36 connect
  0.00    0.000723           2       380           rt_sigaction
  0.00    0.000723           2       296           fcntl64
  0.00    0.000610           3       233           brk
  0.00    0.000604           5       122           close
  0.00    0.000571          16        36           socket
  0.00    0.000426           8        51           accept
  0.00    0.000309           2       190           time
  0.00    0.000301           9        35           old_mmap
  0.00    0.000268           1       190           getpid
  0.00    0.000257           7        35           munmap
  0.00    0.000216           6        36           bind
  0.00    0.000157           4        35        35 rmdir
  0.00    0.000096           3        36           getsockopt
  0.00    0.000087           2        35           getsockname
  0.00    0.000086           2        36           setsockopt
  0.00    0.000083           2        35           _llseek
  0.00    0.000080           2        35           fstat64
------ ----------- ----------- --------- --------- ----------------
100.00   70.636099                402011     81232 total

real    2m18.274s
user    0m3.180s
sys     0m6.060s
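
The two captures ran for different lengths of time, so the raw call counts
are hard to compare directly. Below is a minimal sketch of one way to
normalize them. It assumes the "real" figure reported alongside each table
is a fair stand-in for how long that trace ran. The function names and the
choice of Python are illustrative only and are not part of any strace
tooling. Only a few rows from each table are reproduced.

```python
import re

# One data row of `strace -c` output; the "errors" column may be blank.
ROW = re.compile(
    r"^\s*(?P<pct>\d+\.\d+)\s+(?P<seconds>\d+\.\d+)\s+(?P<usecs>\d+)\s+"
    r"(?P<calls>\d+)\s+(?:(?P<errors>\d+)\s+)?(?P<syscall>[A-Za-z_]\w*)\s*$"
)

# Excerpts of the two tables above (not the full output).
BROKEN = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.65   10.071460        2643      3811      3811 rt_sigsuspend
  4.58    0.509362          48     10517           sendmsg
  1.96    0.218140           5     47573           gettimeofday
  1.75    0.194695          32      6178           write
  0.86    0.095313           6     16498      6172 recvmsg
------ ----------- ----------- --------- --------- ----------------
100.00   11.110003                 92446     13847 total
"""

HEALTHY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 95.63   67.550079        2598     25996     25996 rt_sigsuspend
  2.22    1.571516          49     31969           sendmsg
  1.05    0.742186          26     28402           write
  0.45    0.317469           2    199020           gettimeofday
  0.33    0.232188           4     59225     27813 recvmsg
------ ----------- ----------- --------- --------- ----------------
100.00   70.636099                402011     81232 total
"""

BROKEN_WALL = 45.105             # real 0m45.105s
HEALTHY_WALL = 2 * 60 + 18.274   # real 2m18.274s


def parse_summary(text):
    """Return {syscall: {"calls": int, "errors": int, "seconds": float}}."""
    table = {}
    for line in text.splitlines():
        m = ROW.match(line)
        # Header and separator lines do not match; the "total" row does
        # match (its columns shift left), so skip it explicitly.
        if m is None or m.group("syscall") == "total":
            continue
        table[m.group("syscall")] = {
            "calls": int(m.group("calls")),
            "errors": int(m.group("errors") or 0),
            "seconds": float(m.group("seconds")),
        }
    return table


def calls_per_second(table, wall_seconds):
    """Normalize call counts by the wall-clock length of the capture."""
    return {name: row["calls"] / wall_seconds for name, row in table.items()}


broken_rate = calls_per_second(parse_summary(BROKEN), BROKEN_WALL)
healthy_rate = calls_per_second(parse_summary(HEALTHY), HEALTHY_WALL)

for name in sorted(broken_rate):
    if name in healthy_rate:
        print("%-14s %10.1f/s broken %10.1f/s healthy"
              % (name, broken_rate[name], healthy_rate[name]))
```

On these excerpts, sendmsg comes out at roughly 233 calls per second in
the broken capture and 231 after the restart, while rt_sigsuspend goes from
about 84 to about 188 per second. What that difference means is not
something the numbers alone settle.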


If someone posts clear instructions on how to gather the information needed
to track down this bug, I'll do my best to follow them the next time my bind
boxes act up.
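
One piece of information that might be worth capturing is which thread is
actually spinning. The following is only a sketch, and it assumes a Linux
kernel that exposes per-thread entries under /proc/<pid>/task, which may not
hold for every kernel or threading library. The helper names are made up for
illustration, and the pid passed in is whatever the running named process
happens to be.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/.../stat line."""
    # The command name is parenthesised and may itself contain spaces or
    # parentheses, so split on the last ')' rather than on whitespace alone.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def sample_threads(pid):
    """Map thread id -> CPU ticks consumed so far, for each thread of pid."""
    ticks = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                ticks[int(tid)] = cpu_ticks(f.read())
        except OSError:
            pass  # the thread exited between listdir() and open()
    return ticks


def busiest_threads(pid, interval=5.0):
    """Return [(tid, ticks_used)] over `interval` seconds, busiest first."""
    before = sample_threads(pid)
    time.sleep(interval)
    after = sample_threads(pid)
    deltas = [(tid, after[tid] - before[tid]) for tid in after if tid in before]
    return sorted(deltas, key=lambda item: item[1], reverse=True)
```

Calling busiest_threads() with the pid of the name server and looking at the
first entry would show which thread id consumed the most CPU during the
interval. That id could then be the target of a longer trace.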

Is anyone having these problems on an OS besides Linux or FreeBSD?  Maybe
it's a gcc/glibc or pthreads problem?  Does ISC or anyone else recommend a
specific compiler/library version for compiling bind?  Perhaps versions not
to use?

-- 
Kelsey Cummings - kgc at sonic.net           sonic.net, inc.
System Architect                          2260 Apollo Way
707.522.1000 (Voice)                      Santa Rosa, CA 95407
707.547.2199 (Fax)                        http://www.sonic.net/
Fingerprint = D5F9 667F 5D32 7347 0B79  8DB7 2B42 86B6 4E2C 3896
