debian11 + bind-9.16.15 + dnssec-policy = lost zonefiles + crashes

raf bind at raf.org
Mon Aug 16 02:28:47 UTC 2021


On Sun, Aug 15, 2021 at 10:35:27PM +1000, raf <bind at raf.org> wrote:

> But the real problem is that bind crashed, and dumped
> core, and couldn't start at all. There were a hectic
> few minutes there. :-) I deleted the coredump and the
> key files, and the .jnl files, restored backup
> zonefiles, updated the serials to be greater than that
> of the new DNSSEC signed zones, and then bind was able
> to start again.
> 
> Does anyone have any idea why bind-9.16.15 would have
> crashed repeatedly?

I've had a sleep, looked in the logs, and found this:

  general: notice: all zones loaded
  general: notice: running
  general: critical: rbtdb.c:6780: REQUIRE(((rbtnode->nsec == DNS_RBT_NSEC_NSEC3 && (rdataset->type ==
    ((dns_rdatatype_t)dns_rdatatype_nsec3) || rdataset->covers == ((dns_rdatatype_t)dns_rdatatype_nsec3)))
    || (rbtnode->nsec != DNS_RBT_NSEC_NSEC3 && rdataset->type != ((dns_rdatatype_t)dns_rdatatype_nsec3) &&
    rdataset->covers != ((dns_rdatatype_t)dns_rdatatype_nsec3)))) failed, back trace
  general: critical: #0 0x558ce49ffeed in ??
  general: critical: #1 0x7fd079be6d9a in ??
  general: critical: #2 0x7fd079d7f73c in ??
  general: critical: #3 0x7fd079e45680 in ??
  general: critical: #4 0x7fd079c1b720 in ??
  general: critical: #5 0x7fd079c20f52 in ??
  general: critical: #6 0x7fd07995cea7 in ??
  general: critical: #7 0x7fd079590def in ??
  general: critical: exiting (due to assertion failure)

That assertion failed 13 times before I cleaned up.

Perhaps this is an old bug that's been fixed by now.

The only problems logged in the lead up to these
assertion failures were permission errors trying to
create jnl files in /etc/bind for the zones that
shouldn't have been signed anyway, e.g.:

  general: error: /etc/bind/db.empty.jnl: create: permission denied
  general: error: /etc/bind/db.255.jnl: create: permission denied

AppArmor prevented it, but the directory permissions
would have also prevented it (drwxr-sr-x root bind).

I'm convinced that the dnssec-policy directive
doesn't belong in the options {} stanza, and should
only appear in zone {} stanzas.
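
A minimal sketch of what I mean, with a made-up zone
name and file path (both are assumptions, as is the
inline-signing line, which I'm inferring from the
ZONE.signed and ZONE.jbk files mentioned below):

```
options {
    // other options as before, but no dnssec-policy here,
    // so zones like db.empty and db.255 are left unsigned
};

zone "example.org" {
    type primary;
    file "/var/lib/bind/db.example.org";  // hypothetical path
    inline-signing yes;                   // assumption, see above
    dnssec-policy default;                // built-in policy, this zone only
};
```

That way only the zones that name a policy get signed.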

I tested that approach on a separate VM, and the
behaviour is very different, and completely wonderful.
Instead of overwriting my source zone files and then
crashing, it has created ZONE.jbk, ZONE.signed, and
ZONE.signed.jnl files, all of which are binary. But
last night, I definitely saw the overwritten ZONE files
as a text version of the signed zone. Weird. Never
mind.

So it's looking good and I'm happy now. But how long
after the zone has been signed can I expect to see
CDS/CDNSKEY RRs appear? Why aren't they created at
the same time as the DNSKEY RRs? I assume there's
a good reason but I can't think what it is.

Also, please document the dangers of putting a
dnssec-policy directive in the options {} stanza
(unless something significant has changed since version
9.16.15, and bind now knows not to sign zones that
really shouldn't be signed locally - but even if that's
the case, you could document what version that changed in).

Thanks again for making DNSSEC so easy to implement
(as long as you avoid classic rookie errors). :-)

cheers,
raf


