named-xfer causes 100% CPU Utilization

Danny Mayer mayer at gis.net
Tue Oct 2 01:12:03 UTC 2001


At 05:39 PM 10/1/01, Kevin Vaughn wrote:
>I found another symptom of my problem and I was wondering if you could point
>me in the right direction.  I turned debugging on for my slave server and I
>noticed something peculiar in the named.run log.  Below is some of the
>output:
>
>*****
>01-Oct-2001 12:40:15.000 default: warning: zone transfer timeout for
>"pccatest.com"; pid 132 kill failed Errcode: 10035: Errcode: 10035: Errco
>
>01-Oct-2001 12:40:45.000 default: warning: zone transfer timeout for
>"pccatest.com"; second kill pid 132 - forgetting, processes may accumulate
>.
>.
>.
>01-Oct-2001 14:05:03.000 default: notice: named-xfer for "pccatest.com"
>exited 1
>*****
>
>This is why the processes are piling up and not dying on their own; BIND
>can't kill them.  Why are they failing (timing out) in the first place?  The
>initial zone transfer works when the service is initially started.  I have
>turned off NOTIFY to see if the problems go away.  If I manually run
>/dns/bin/named-xfer -z pccatest.com -f /temp/testzone.db -d 3 -s 0
>pcnwpnstst, I still have a problem with named-xfer hanging and then
>multiplying itself over and over again.  The last line above says it exited
>with 1.  Doesn't an exit code of 1 mean that the transfer completed
>successfully?  I have looked at the log for named-xfer, but all of the
>transfers complete with no errors.  I have searched every log I can think of
>... I don't know what to do.

The key to the problem is on the master, not the slave.  The above is just a
symptom of it.  Upgrade to BIND 8.2.5-REL when it is announced.  It won't
totally solve the problem, but you will no longer have named-xfer processes
chewing up CPU.  If you want to avoid the problem altogether, install
BIND 9.2.0rc5 as soon as it's announced.
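
Until then, it may help to track how often the slave logs a transfer timeout
per zone.  The following is only a rough Python sketch, not part of BIND: the
function name count_transfer_timeouts is hypothetical, and the pattern assumes
the warning text looks like the named.run lines quoted above (the sample is
abbreviated from them).

```python
import re
from collections import Counter

# Assumed message format, taken from the named.run excerpt quoted above.
# \s+ lets the zone name sit on the same line or on the following line.
TIMEOUT_RE = re.compile(r'zone transfer timeout for\s+"([^"]+)"')


def count_transfer_timeouts(log_text):
    """Return a Counter mapping zone name to number of timeout warnings."""
    return Counter(TIMEOUT_RE.findall(log_text))


# Abbreviated sample based on the log output quoted in this message.
sample = '''01-Oct-2001 12:40:15.000 default: warning: zone transfer timeout for
"pccatest.com"; pid 132 kill failed
01-Oct-2001 12:40:45.000 default: warning: zone transfer timeout for
"pccatest.com"; second kill pid 132 - forgetting, processes may accumulate
01-Oct-2001 14:05:03.000 default: notice: named-xfer for "pccatest.com"
exited 1
'''

for zone, hits in count_transfer_timeouts(sample).items():
    print(zone, hits)
```

To use it on a real log, read the whole file into one string and pass that
in; the "exited" notice is deliberately not counted as a timeout.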

>I have taken the advice of Danny Mayer and bumped my virtual memory on both
>servers to 500MB.  I also replaced allow-update with allow-transfer, which
>was causing an error to be generated.  While both of these suggestions
>helped, the original problem still persists.

The real problem is still on the master.

         Danny


