bind9 is taking little Breaks for Some Reason.

Clenna Lumina savagebeaste at yahoo.com
Tue Jun 26 18:30:21 UTC 2007


Martin McCormick wrote:
> I have done more testing and am almost certain that the
> problem lies in the box itself, though I am just as mystified as
> ever as to what is happening.
>
> The one diagnostic tool that spotlights the problem for
> sure is netstat run as:
>
> netstat -w5
>
> This prints counts of input packets and total bytes received as
> well as output packets, errors and bytes sent during a 5-second
> interval.
>
> In one test, I started the readings at exactly 16:00 on
> Thursday and let them accumulate all night with the following
> command:
>
> netstat -w5 |tee counts
>
> Which prints both to the screen and to the file named counts.
>
> Next morning, I stopped the test and looked at the syslog of our
> DHCP server which is running on a different box and looked for
> the first "timed out" complaints.  There was 1 at 16:21 and a
> few seconds.
>
> I then stripped out the headers that netstat puts in so
> all that was left was columns of numbers and sorted on the 6th
> column, which is bytes out.  In the entire roughly 15-hour
> period, there were several rows with only 178 bytes sent out
> the Ethernet interface in 5 seconds.  On our master DNS, there
> are normally tens to hundreds of kilobytes sent in 5 seconds.
> The first output drought occurred just 21 minutes after I started
> the test and coincided with the first "timed out" message.  Here
> is what the minute in which the outage occurred looked like.
> The first and last lines are normal and then you see the hit.
>
>            input        (Total)           output
>   packets  errs      bytes    packets  errs      bytes colls
>      1372     0     145160       1246     0     206202     0
>       676     0      63491        217     0      33595     0
>       578     0      50200          6     0        560     0
>       647     0      55292          2     0        242     0
>       681     0      58581          1     0        178     0
>       763     0      65570          3     0        302     0
>       725     0      63723          2     0        246     0
>       781     0      66721          3     0        302     0
>       746     0      64504         15     0       1222     0
>       770     0      66112         14     0       1150     0
>       942     0      84863        298     0      88924     0
>      2419     0     256723       2234     0     381206     0
>
> Basically, the packets go in during one of these
> narcoleptic seizures and hardly anything comes back out for several
> seconds although the interface stays up.  After about a minute, the
> output comes back up to normal and the sun comes out and the
> birds sing for anywhere from 30 minutes to a couple of hours and
> then another hiccup.
>
> As I said earlier, nothing is complaining about
> anything.  bind looks quite tranquil based on a rndc status
> that I ran off of an expect script triggered by the timeout
> messages.
>
> Martin McCormick WB5AGZ  Stillwater, OK
> Systems Engineer
> OSU Information Technology Department Network Operations Group

My guess would be either A) a hardware issue (such as the NIC), if it's 
happening on just that one box, or B) the switch, if it's happening on 
multiple machines.  In that case, try another switch and see whether 
anything changes.  If it does, the cause could be an errant setting or 
a general hardware failure in the original switch.
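
For picking the output droughts out of the counts file without 
stripping and sorting by hand, something along these lines might help. 
It is only a rough sketch: the function names are made up for 
illustration, the 2000-byte threshold is an arbitrary guess to tune, 
and it assumes the seven-column layout shown in the sample above. 
netstat does not timestamp its lines, so the times are estimated from 
the row number and the 5-second interval and will drift a little over 
a long run.

```python
from datetime import datetime, timedelta

def parse_counts(lines):
    """Return the numeric rows of `netstat -w5` output as tuples of ints."""
    rows = []
    for line in lines:
        fields = line.split()
        # Keep only rows of seven integers; this drops both header lines.
        if len(fields) == 7 and all(f.isdigit() for f in fields):
            rows.append(tuple(int(f) for f in fields))
    return rows

def find_droughts(rows, start, interval=5, threshold=2000):
    """Return (approximate time, output bytes) for rows below `threshold`."""
    hits = []
    for i, row in enumerate(rows):
        bytes_out = row[5]  # sixth column: output bytes
        if bytes_out < threshold:
            # Assumes row i was printed about i * interval seconds after start.
            hits.append((start + timedelta(seconds=i * interval), bytes_out))
    return hits
```

Feeding it the lines of the counts file and the time the capture was 
started gives a list of low-output intervals that can be lined up 
against the "timed out" entries in the DHCP server's syslog.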

-- 
CL 



