How Zone Files Are Read

Mark Andrews marka at isc.org
Wed Dec 16 23:53:26 UTC 2020



> On 17 Dec 2020, at 06:44, Timothe Litt <litt at acm.org> wrote:
> 
> 
> On 16-Dec-20 13:52, Tim Daneliuk wrote:
>> On 12/16/20 12:25 PM, Timothe Litt wrote:
>> 
>>> On 16-Dec-20 11:37, Tim Daneliuk wrote:
>>> 
>>>> I ran into a situation yesterday which got me pondering something about bind.
>>>> 
>>>> In this case, a single line in a zone file was bad.  The devops automation
>>>> had inserted a space in the hostname field of a PTR record.
>>>> 
>>>> What was interesting was that - at startup - bind absolutely refused
>>>> to load the zone file at all.  I would have expected it to complain
>>>> about the bad record and ignore it, but load the rest of the
>>>> good records.
>>>> 
>>>> Can someone please explain the rationale or logic for this?  Not complaining,
>>>> just trying to understand for future reference.
>>>> 
>>>> TIA,
>>>> Tim
>>>> 
>>> DNS is complicated.  The scope of an error in a zonefile is hard to determine.
>>> 
>>> To avoid this, your automation should use named-checkzone before releasing a zone file.
>>> 
>>> It performs all the checks that named performs when it loads the zone.
>>> 
>>> 
>> 
>> Kind of what I thought.  Whoever built the environment in question
>> really didn't understand DNS very well and hacked together a kludge
>> that I am still trying to get my head around.
>> 
>> 
> For a simple example of why it's complicated - what if the typo you had was for a host that sends e-mail?
> 
> You'll see intermittent delivery errors when remote hosts can't resolve the host's address; some require that a reverse lookup resolve to the host as an anti-spoofing measure.  Others won't.  You'll spend a long time diagnosing.
> 
> named can't tell this case from a typo for a local printer's PTR - where it's unlikely that a reverse lookup failure will matter.  Of course, this means it could go undetected for years - until it IS needed.
> 
> Or the typo is in a NS record - which you probably won't detect until the other NS goes down...
> 
> And, any errors are cached for their TTL by resolvers.  The TTL may (hopefully for query rate reduction) be large.  In your case, it would be the negative TTL (meaning that even adding the record later wouldn't have immediate effect).
> 
> The bottom line is that named must assume that anything placed in a zone file is important, and that the external impact - either sin of omission or commission - might be large.
> 
> Thus, while named can't detect all (or even most) errors, those that it does detect cause immediate failure to load.  That prevents caching and propagation as well as getting human attention.
> 
> When something's wrong, it's best to stop and fix it.  Error recovery is a very good thing - but only when you can demonstrate that the cure is better than the disease.  Skipping format errors in a zone file would not satisfy that constraint.
> 
> Timothe Litt
> ACM Distinguished Engineer
> --------------------------
> This communication may not represent the ACM or my employer's views,
> if any, on the matters discussed. 

And on top of all this there is STD 13 (RFC 1034, RFC 1035) which says
in RFC 1035:

"5.2. Use of master files to define zones

When a master file is used to load a zone, the operation should be
suppressed if any errors are encountered in the master file.  The
rationale for this is that a single error can have widespread
consequences.  For example, suppose that the RRs defining a delegation
have syntax errors; then the server will return authoritative name
errors for all names in the subzone (except in the case where the
subzone is also present on the server).

Several other validity checks that should be performed in addition to
insuring that the file is syntactically correct:

   1. All RRs in the file should have the same class.

   2. Exactly one SOA RR should be present at the top of the zone.

   3. If delegations are present and glue information is required,
      it should be present.

   4. Information present outside of the authoritative nodes in the
      zone should be glue information, rather than the result of an
      origin or similar error."
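
As a rough illustration of the all-or-nothing rule quoted above, here is a
minimal Python sketch.  It is not how BIND parses zones: the simplified
one-record-per-line format and the names Record, ZoneLoadError, parse_line
and load_zone are assumptions made up for this example.  It rejects the whole
zone on the first syntax error (including a space inside a PTR target, as in
the original report) and then applies checks 1 and 2 from the list.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    rdata: str


class ZoneLoadError(Exception):
    """Raised on any bad line or failed check; no partial zone is returned."""


# Record types whose rdata is a single domain name (hypothetical subset).
SINGLE_NAME_TYPES = {"PTR", "NS", "CNAME"}


def parse_line(line: str, lineno: int) -> Record:
    # Assumed simplified format: "name ttl class type rdata" on one line.
    fields = line.split(None, 4)
    if len(fields) != 5 or not fields[1].isdigit():
        raise ZoneLoadError(f"line {lineno}: syntax error: {line!r}")
    name, ttl, rclass, rtype, rdata = fields
    rtype = rtype.upper()
    if rtype in SINGLE_NAME_TYPES and len(rdata.split()) != 1:
        raise ZoneLoadError(f"line {lineno}: bad {rtype} target: {rdata!r}")
    return Record(name, int(ttl), rclass.upper(), rtype, rdata)


def load_zone(text: str, origin: str) -> List[Record]:
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.split(";", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        # Any exception here aborts the load; nothing is skipped.
        records.append(parse_line(line, lineno))

    # Check 1: all RRs should have the same class.
    if len({r.rclass for r in records}) > 1:
        raise ZoneLoadError("records have mixed classes")
    # Check 2: exactly one SOA, at the top of the zone.
    soas = [r for r in records if r.rtype == "SOA"]
    if len(soas) != 1 or soas[0].name != origin:
        raise ZoneLoadError("expected exactly one SOA at the zone origin")
    return records


zone_text = """
2.0.192.in-addr.arpa. 3600 IN SOA ns1.example.com. admin.example.com. 1 7200 900 1209600 300
2.0.192.in-addr.arpa. 3600 IN NS ns1.example.com.
5.2.0.192.in-addr.arpa. 3600 IN PTR host1.example.com.
"""
zone = load_zone(zone_text, "2.0.192.in-addr.arpa.")
```

Changing the PTR target to "host 1.example.com." makes load_zone raise
ZoneLoadError for that line instead of returning the other two records.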

Those of us with long memories have seen all the error scenarios
reported here play out in real life, because early versions of BIND
did just drop bad lines and continue on as "best effort".  We fixed
this behaviour over two decades ago, with no regrets other than that
we didn't fix it sooner.

The above list of things to check is not exhaustive.  BIND checks much
more these days.
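
The pre-release check with named-checkzone suggested earlier in the thread
can be scripted.  Below is a minimal Python sketch, assuming named-checkzone
is on the PATH and relying on its documented exit status (0 when the zone is
clean, non-zero when errors were detected).  The function name
zone_is_loadable and the injectable run parameter are inventions for this
example, not part of BIND.

```python
import subprocess
from typing import Callable


def zone_is_loadable(
    zone_name: str,
    zone_file: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True only if named-checkzone accepts the zone file."""
    # named-checkzone takes the zone name followed by the zone file.
    result = run(
        ["named-checkzone", zone_name, zone_file],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the checker's diagnostics so a human sees the bad line.
        print((result.stdout or "") + (result.stderr or ""), end="")
        return False
    return True
```

Automation would call zone_is_loadable("example.com", "db.example.com")
(both names are placeholders) and refuse to publish the file when it returns
False.  The run parameter exists only so the function can be exercised
without BIND installed.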

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: marka at isc.org


