Recommendations for replacing a master server without breaking DNSSEC

Tony Finch dot at dotat.at
Wed Nov 24 12:26:10 UTC 2021


Ralph Seichter via bind-users <bind-users at lists.isc.org> wrote:
>
> How would you go about moving all functionality from Alpha to Beta,
> ideally with minimal downtime, and with the hard requirement of not
> breaking DNSSEC? How would one need to handle key material, zone
> signatures, journals, etc.?

There was this time when we had a hardware failure that took out our
primary DNS server. It looked like it was going to take a long time to
fix the failure, so I stood up a replacement primary on different
hardware, which was relatively quick using our Ansible playbooks.

So the new server had a copy of all the relevant secrets (DNSSEC private
keys, TSIG keys, ssh keys, ...) installed by Ansible, which meant that the
new server's zones would all validate OK. So it was able to rebuild all
the zones and sign them from scratch, then take over from the dead server.

Using the same keys makes the process much easier than trying to do a
key rollover at the same time. Don't make delicate ops work needlessly
tricky!

We also use `serial-update-method unixtime`, so I did not have to worry
about SOA serial number resets. If you are currently using the default
`increment` method you can switch to unixtime without having to worry
about wraps (though switching from `date` is problematic, because
YYYYMMDDnn serials are already numerically larger than the current unix
time). Do this before the migration, to remove another hazard.
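
To illustrate the serial number point, here is a minimal sketch of the
RFC 1982 comparison that decides whether a secondary treats a serial as
newer. The sample serials are made-up values for illustration, not taken
from any real zone.

```python
# RFC 1982 serial number arithmetic for 32-bit SOA serials.
MOD = 2**32
HALF = 2**31


def serial_gt(new: int, old: int) -> bool:
    """Return True if `new` counts as newer than `old`.

    A serial is newer when it is ahead of the old one by less than
    2**31, modulo 2**32. A difference of exactly 2**31 is undefined
    in RFC 1982 and is treated as "not newer" here.
    """
    diff = (new - old) % MOD
    return 0 < diff < HALF


# Hypothetical serials, chosen only to show the two cases.
unixtime_serial = 1637756770   # 2021-11-24 12:26:10 UTC as unix time
increment_serial = 4217        # a small counter from the increment method
date_serial = 2021112401       # YYYYMMDDnn from the date method

# Switching from increment to unixtime moves the serial forwards.
print(serial_gt(unixtime_serial, increment_serial))

# Switching from date to unixtime moves it backwards, so secondaries
# would ignore the new zone until the serial is wrapped deliberately.
print(serial_gt(unixtime_serial, date_serial))
```

The first comparison is true and the second is false, which is why a
switch from `date` needs more care than a switch from `increment`.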

When you are standing up a new primary, there are a few things you can do
to check that the new zones are OK: use https://dotat.at/prog/nsdiff/ to
verify that the non-DNSSEC zone contents are identical; use
`dnssec-verify` to check the DNSSEC parts.
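
The idea behind the first check can be sketched as follows. This is not
how nsdiff is implemented; it is a simplified illustration that assumes
each zone has already been loaded as a list of (name, ttl, type, rdata)
tuples. The helper names, the sample records, and the exact set of
ignored types are my assumptions.

```python
# Record types created or managed by signing (assumed ignore list).
DNSSEC_TYPES = {
    "RRSIG", "NSEC", "NSEC3", "NSEC3PARAM", "DNSKEY", "CDS", "CDNSKEY",
}


def unsigned_view(records):
    """Return the zone contents with DNSSEC records and SOA serial masked."""
    view = set()
    for name, ttl, rtype, rdata in records:
        rtype = rtype.upper()
        if rtype in DNSSEC_TYPES:
            continue
        if rtype == "SOA":
            # SOA rdata: mname rname serial refresh retry expire minimum
            fields = rdata.split()
            fields[2] = "SERIAL"
            rdata = " ".join(fields)
        view.add((name.lower(), ttl, rtype, rdata))
    return view


def zone_diff(old, new):
    """Return (only_in_old, only_in_new), ignoring signing artefacts."""
    old_view = unsigned_view(old)
    new_view = unsigned_view(new)
    return sorted(old_view - new_view), sorted(new_view - old_view)


# Hypothetical example data: same contents, different serial and signature.
SOA = "ns1.example.org. hostmaster.example.org. {} 3600 900 604800 3600"
old_zone = [
    ("example.org.", 3600, "SOA", SOA.format(1637000000)),
    ("www.example.org.", 3600, "A", "192.0.2.1"),
    ("www.example.org.", 3600, "RRSIG", "A 13 3 3600 (old signature)"),
]
new_zone = [
    ("example.org.", 3600, "SOA", SOA.format(1637756770)),
    ("www.example.org.", 3600, "A", "192.0.2.1"),
    ("www.example.org.", 3600, "RRSIG", "A 13 3 3600 (new signature)"),
]

missing, extra = zone_diff(old_zone, new_zone)
print("only in old:", missing)
print("only in new:", extra)
```

Both lists are empty when the rebuilt zone matches the old one apart
from its serial and signatures, which is the result you want before
switching over.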

A minor downside of rebuilding from scratch like this is that your
secondaries will have to retransfer the complete zone contents, but that
was not a problem at our scale (cam.ac.uk is 150,000 records and we have
similarly sized private and reverse DNS zones).

Basically, I ignored the journals as ephemeral, and I knew that re-signing
from scratch would generate working signatures even though they are all
different from the old signatures. Even if you are running both old and
new in parallel before the switchover, unixtime means you don't have to
worry about serial numbers either.

(The biggest mistake I made with this operational surprise was to rebuild
the primary on the same IP addresses rather than promote its sibling to
take over on different addresses. I chose to do it in place so I did not
have to reconfigure the other servers to point to the alternate primary;
instead I had to do some delicate and unscripted firewall adjustments to
stop the other servers from pulling down incomplete zones while the
rebuild was in progress. In retrospect that was the wrong choice. But
since you are moving to a new location I suspect you don't have this
hazard.)

I think a procedure like this is a good way to migrate a primary server if
the old and new servers are run by the same people, though I recommend
that you don't do it very quickly after a hardware failure if you can avoid
it.

Tony.
-- 
f.anthony.n.finch  <dot at dotat.at>  https://dotat.at/
Lundy, Fastnet, Irish Sea: West or northwest 3 to 5, veering north or
northeast 6 to gale 8. Smooth or slight, becoming slight or moderate,
then moderate or rough in Fastnet and later elsewhere. Showers. Good,
occasionally moderate.


