BIND 9.6 freezing on update to signed zone (rare!)

Chris Thompson cet1 at cam.ac.uk
Mon Jul 20 19:03:20 UTC 2009


On Jul 15 2009, I wrote:

>We had an incident last night on the authoritative nameserver which
>is master for dnssec-test.csi.cam.ac.uk (a signed zone). At the time
>it was running BIND 9.6.1rc1 (but I doubt if 9.6.1 is going to make
>a difference). A script-generated update timed out, and it subsequently
>failed to respond to any DNS queries or rndc commands (although the
>named process was still running).
>
>It has to have been the update itself that caused this. (It had just
>previously processed updates to two unsigned zones perfectly.) On
>the other hand, it had previously processed dozens of updates to the
>signed zone without any problems (it is maintained as an approximate
>clone of cam.ac.uk), and there wasn't anything unusual about this one.
>Indeed there was no problem re-applying it after BIND had been restarted.
>I am reduced to speculating about timing effects, e.g. collision with
>a re-signing event.
>
>Unfortunately I failed to get a core dump of named in the non-responding
>state (I need to review my procedures for that!) so I haven't got enough
>to report to bind-bugs. This is an appeal to ask if anyone has seen
>anything similar.

Some extra information: for the 14+ hours before the hang, named had
been logging messages like this:

Jul 14 10:44:24 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:45:54 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:50:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:51:51 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:56:15 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
...
Jul 15 00:50:56 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:52:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:53:47 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:55:13 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
 general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact

But I am no nearer to understanding what causes these. Several externally
applied updates to the zone were processed (apparently successfully)
during this period, before the one that hung.
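
One way to examine the timing-effect speculation is to tabulate the gaps
between the "not exact" messages and compare them with the times of the
applied updates. What follows is only a minimal sketch, assuming the syslog
layout shown in the excerpt above (each entry wrapped onto two lines) and
English month abbreviations. The function names and the year argument are
illustrative; syslog timestamps carry no year, so it has to be supplied.

```python
import re
from datetime import datetime

# Timestamp at the start of a named syslog line, e.g. "Jul 14 10:44:24".
STAMP = re.compile(
    r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ named\[\d+\]:"
)
MARKER = "updatesecure -> not exact"


def not_exact_times(lines, year=2009):
    """Return the datetimes of log entries that contain MARKER.

    The message text may be on the continuation line that follows the
    timestamp line (as in the wrapped excerpt) or on the same line.
    """
    times = []
    pending = None  # timestamp of the most recent header line
    for line in lines:
        m = STAMP.match(line)
        if m:
            pending = datetime.strptime(
                f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S"
            )
        if pending is not None and MARKER in line:
            times.append(pending)
            pending = None  # count each entry once
    return times


def gaps_in_seconds(times):
    """Seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # First three entries from the excerpt above.
    sample = [
        "Jul 14 10:44:24 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]",
        " general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact",
        "Jul 14 10:45:54 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]",
        " general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact",
        "Jul 14 10:50:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]",
        " general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact",
    ]
    print(gaps_in_seconds(not_exact_times(sample)))  # prints [90, 268]
```

Run over the full log, the list of gaps could then be set beside the
update times to see whether the messages track the updates or some
other periodic activity. The script itself does not establish either.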

-- 
Chris Thompson
Email: cet1 at cam.ac.uk


