forwarding to a child zone is different!!

Kevin Darcy kcd at daimlerchrysler.com
Thu Apr 26 00:37:59 UTC 2001


Brad Knowles wrote:

> At 5:41 PM -0400 4/25/01, Kevin Darcy wrote:
>
> >  Huh? I don't follow. You seem to be implying that being authoritative
> >  makes cache pollution more likely. Seems like it should be the other
> >  way around, i.e. if you're authoritative for a zone, then all of that
> >  data is of high "credibility" and thus less subject to poisoning.
>
>         Unfortunately, that's not the way it works.  The reality is that
> it is possible to successfully lie to any caching nameserver, and get
> it to believe that the answer is correct.
>
>         If the server in question is caching-only (i.e., not
> authoritative for any zones), while this "lie" may be propagated to
> other servers, at the very least it shouldn't be handed out as an
> authoritative answer, so they can choose to believe it or not.  The
> cache may be polluted, but it doesn't get propagated.
>
>         If the server in question is authoritative, and the successful
> lie is for an answer within the authoritative zone (yes, it will be
> cached), then when this answer is handed out to other servers, it
> will be handed out and marked as authoritative, and other servers are
> largely forced to actually believe it.    The cache is now polluted,
> and because the server is authoritative for the zone in question, it
> is also getting propagated.

Any server which overlays authoritative data, i.e. data it loaded from a zone
file (other than glue), with data learned from a response is in blatant
violation of RFC 2181, Section 5.4.1. BIND 8 has not had this problem for a
number of years, and I'm sure BIND 9 does not have it either. Cache
pollution, as the name suggests, results from bad *cache* data, not from
authoritative data.
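
To make the ranking idea concrete, here is a minimal sketch of the kind of check RFC 2181, Section 5.4.1 calls for. The class and method names (Credibility, RRStore, offer, lookup) are hypothetical, the ranking is a simplified paraphrase of the RFC's list, and this is not how BIND implements it internally.

```python
from enum import IntEnum


class Credibility(IntEnum):
    """Simplified trust ranking, loosely paraphrasing RFC 2181 5.4.1.

    Higher value means more trustworthy. Names are illustrative only.
    """
    ADDITIONAL = 1      # additional-section data and similar low-trust data
    NONAUTH_ANSWER = 2  # answer section of a non-authoritative reply
    GLUE = 3            # glue from a zone file or zone transfer
    AUTH_AUTHORITY = 4  # authority section of an authoritative reply
    AUTH_ANSWER = 5     # answer section of an authoritative reply
    ZONE_TRANSFER = 6   # zone-transfer data other than glue
    ZONE_FILE = 7       # primary zone-file data other than glue


class RRStore:
    """Toy RRset store that never lets weaker data overlay stronger data."""

    def __init__(self):
        self._data = {}  # (owner name, type) -> (rdata, credibility)

    def offer(self, name, rtype, rdata, cred):
        """Store rdata unless something more credible is already held.

        Returns True if the offered data was accepted.
        """
        key = (name.lower(), rtype)
        current = self._data.get(key)
        if current is not None and current[1] > cred:
            return False  # keep the more trustworthy data
        self._data[key] = (rdata, cred)
        return True

    def lookup(self, name, rtype):
        entry = self._data.get((name.lower(), rtype))
        return None if entry is None else entry[0]
```

With a store like this, an A record loaded at ZONE_FILE credibility stays put when a response later offers a different address for the same name at AUTH_ANSWER or lower; the offer is simply rejected.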

Moreover, even if a modern nameserver could be tricked into responding
authoritatively with spoofed data, the AA bit should not be used by clients
for *authenticating* DNS responses. Its purpose is primarily to provide
lame-server detection, not security. Any software that trusts authoritative
responses more than non-authoritative ones has a flawed security architecture
anyway. The lack of a decent way to authenticate DNS transactions was, after
all, one of the major driving forces for TSIG and DNSSEC; the AA bit is in no
way a substitute for such methodologies. I've recently fought out this point
with a major security-software vendor (their mail gateway product was only
accepting authoritative DNS responses, for "security reasons") and they
ultimately agreed with me that that made zero sense and changed the behavior
of their product. Can you cite any software that actually cares about the
AA bit from a security standpoint?
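
For reference, the AA bit is nothing more than one unauthenticated flag in the 12-byte DNS header, set by whoever built the response. A rough sketch of reading it, and of using it for what it is good for (a crude lame-server check), might look like the following. The function names header_flags and looks_lame are made up for illustration, and the lameness test is deliberately simplistic.

```python
import struct

AA_MASK = 0x0400     # Authoritative Answer bit in the 16-bit flags word
QR_MASK = 0x8000     # set on responses, clear on queries
RCODE_MASK = 0x000F  # low four bits carry the response code


def header_flags(message: bytes) -> dict:
    """Extract a few flag fields from the fixed 12-byte DNS header."""
    if len(message) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    _msg_id, flags, _qd, _an, _ns, _ar = struct.unpack("!6H", message[:12])
    return {
        "qr": bool(flags & QR_MASK),
        "aa": bool(flags & AA_MASK),
        "rcode": flags & RCODE_MASK,
    }


def looks_lame(message: bytes) -> bool:
    """Crude check for a reply from a server listed in a zone's NS RRset.

    Assumption: the caller sent a non-recursive query for a name in that
    zone. A NOERROR response without AA then suggests the server is lame.
    """
    f = header_flags(message)
    return f["qr"] and f["rcode"] == 0 and not f["aa"]
```

Nothing in there verifies who sent the packet, which is the whole point: anyone who can spoof a response can set that bit too.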

> >  From a query-latency standpoint, you really can't beat having all of
> >  the data for a zone resident in memory at all times. So if you have
> >  the memory (my boxes have 2Gb RAM apiece) and the zone-transfer
> >  overhead doesn't bring the box to its knees (it doesn't), then being
> >  a slave for the most frequently-used zones makes a lot of sense.
> >  Besides, I have to do it anyway for purposes of redundancy.
>
>         From the perspective of the overall latency to any one query, you
> may be right.  However, if your TTLs are set properly and your
> caching servers are operating correctly, then averaged over the life
> of a typical record, a caching server should give you 99.99...99% of
> the overall average latency performance that you'd get with the same
> server also being authoritative.

So it's 99.99...99% as good, and *never* better than an authoritative server,
unless the memory used by all that authoritative data, the zone transfers,
or the serial-number checks themselves slow down the authoritative nameserver
enough to affect its query response time. On the other hand, with a caching
server, all of that caching activity
(creating entries, "refreshing" entries, scanning for expired entries and
purging them) could *also* slow down the machine, as much or more. Bottom
line: on a machine with sufficient capacity, answering from authoritative
data will never be worse, and may be better, than recursing and/or answering
from cache. So that's the option I choose for my heavily-used zones.

OBTW, that 99.99...99% figure may be applicable to ISP-type environments, but
not to all environments. We, for instance, have rather short TTLs (typically 1
hour) in order to speed up change propagation. It is quite common for names
to be queried less frequently than hourly (many queries are driven by
processes that kick off daily, or for each 8-hour work shift), so our cache
hit ratio would be far smaller than 99%. Even in "popular" zones, the hit
ratios of some less popular names in those zones could easily be 50% or less.
When the server is authoritative for the zone, however, the hit ratio for
*all* names is effectively 100% no matter what.
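
To put rough numbers on that, here is a back-of-the-envelope sketch under two simplifying assumptions that are mine, not measurements: an entry is cached on a miss and expires exactly one TTL later with no prefetching, and queries for a name arrive either at a fixed interval or as a Poisson process. The function names are hypothetical.

```python
import math


def periodic_hit_ratio(interval_hours: float, ttl_hours: float) -> float:
    """Hit ratio for one name queried at a fixed interval.

    Each miss caches the record; later queries hit only while strictly
    less than one TTL has elapsed since that miss.
    """
    hits_per_miss = math.ceil(ttl_hours / interval_hours) - 1
    return hits_per_miss / (hits_per_miss + 1)


def poisson_hit_ratio(queries_per_hour: float, ttl_hours: float) -> float:
    """Hit ratio for one name with Poisson query arrivals.

    Each miss is followed by an average of rate * TTL hits before the
    entry expires, so the ratio is rate*TTL / (1 + rate*TTL).
    """
    load = queries_per_hour * ttl_hours
    return load / (1.0 + load)


if __name__ == "__main__":
    # A job that runs once per 8-hour shift against a 1-hour TTL.
    print(periodic_hit_ratio(8.0, 1.0))
    # A name looked up about once an hour on average, 1-hour TTL.
    print(poisson_hit_ratio(1.0, 1.0))
```

Under those assumptions the once-per-shift name never hits the cache at all, and a name queried about once an hour on average comes out at 50%.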

>         Moreover, if you set up a local authoritative secondary on a
> separate machine (or a separate process on the same machine), the
> caching server will automatically prefer it for virtually all
> queries, and you avoid the risks of cache pollution.  This also makes
> it easier to scale the system in the future, should the load reach a
> level that exceeds what a single machine could handle.

It will only prefer it if it's a *registered* authoritative server. But the
machines I was talking about are all *stealth* slaves (do you think I'm going
to pollute my NS RRsets with dozens of plant nameservers??). So I'd have to
set up forwarding or stub zones to make that work. Blech. I really can't see
the point of setting up multiple nameserver instances on an intranet box,
sitting on a trusted network, just to effect a separation between recursive
and non-recursive function. You're introducing unnecessary administrative
complexity and extra overhead (since the queries which need to recurse from
one instance to the other now require processing by *two* instances/processes
instead of just one) for negligible gain.

As for scaling up, I'd do it the old-fashioned way: buy another machine,
configure it the same way, and point half of the clients to it. The hardware
investment is the same as your proposal, and ongoing administration is
easier/cheaper since both nameservers are configured identically. Repointing
clients to different nameservers is also easy, if you're using DHCP...

Why don't we just agree that separating recursive from non-recursive
functions is generally recommended on/between network boundaries, i.e. where
there are differing levels of trust with respect to the respective networks, a
higher danger of cache pollution, denial-of-service attacks, etc., but that
outside of that context, hybrid recursive/non-recursive configurations are
often appropriate?


- Kevin



