Forwarding to a Cache

Kevin Darcy kcd at daimlerchrysler.com
Wed Feb 22 03:34:42 UTC 2006


Kevin Darcy wrote:

>Kimi Ostro wrote:
>
>>On 03/02/06, Kevin Darcy <kcd at daimlerchrysler.com> wrote:
>>
>>>Not quite. My answer is "try it and see". In some situations, forwarding
>>>opportunistically (i.e. forward "first" instead of forward "only") makes
>>>sense from a performance standpoint.
>>>
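For anyone not familiar with the distinction, the two modes differ only in
whether named falls back to normal iterative resolution when the forwarders
fail to answer. A minimal named.conf sketch (the forwarder addresses are
placeholders, not anything from this thread):

options {
    // "forward first": ask the forwarders, fall back to iterative
    // resolution from the root hints if they don't answer.
    forward first;
    forwarders { 192.0.2.1; 192.0.2.2; };
};

// "forward only" would never fall back:
//     forward only;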
>>Sorry for the lengthy delay in replying.
>>
>>I see. Well, I am not really looking at this so much from the performance
>>side of things, to be honest, although it is still nice to know. I have
>>always thought there was too much emphasis on performance; I am more
>>interested in integrity and stability.
>>
>>>There are a number of factors that
>>>play into this, but they basically boil down to cache-hit ratios, the
>>>reliability/performance of your forwarders, and the latency of your
>>>connectivity to those forwarders, relative to your latency, and their
>>>latency, to the Internet at large.
>>>
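Whether forwarding actually wins on latency is easy enough to measure, since
dig reports the round-trip time of each query; comparing a lookup sent to the
forwarder against one resolved by your own server gives a rough feel for it
(the forwarder address below is a placeholder):

dig @192.0.2.53 www.example.com A | grep 'Query time'
dig @127.0.0.1 www.example.com A | grep 'Query time'

Bear in mind that a cached answer comes back in a millisecond or two either
way, so test with names that aren't already hot.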
>>Hmmm... this is probably where that lapse in lookups comes from every once
>>in a while. In most cases lookups take milliseconds, but there are times
>>when a lookup can take 4-5 seconds.
>>
>>>To expand a little bit on cache-hit
>>>ratios: if the names you're looking up are very popular (e.g. yahoo.com,
>>>msn.com, google.com, chrysler.com :-) with the forwarder's other
>>>clients, and happen to have reasonable TTL values, then the chances of a
>>>particular name being in the cache are high, and when the answer comes
>>>from cache, from a non-overloaded nameserver, with good, non-overloaded
>>>network infrastructure between that nameserver and yours, then it's
>>>going to be fast. Sometimes cache-hit ratios can be affected by things
>>>like timezone, e.g. one of the forwarder's big clients is in a time zone
>>>1 hour ahead of yours, so their users logging on in the morning populate
>>>the forwarders' cache with, say, www.ebay.com, etc. entries, and so the
>>>cache is "hot" by the time your users are logging on, and you reap the
>>>benefits of their query-latency times. Bear in mind that, since you're
>>>caching locally in addition to your forwarders caching, and since DNS is
>>>designed so that all cache entries derived from the same lookup from an
>>>authoritative nameserver time out simultaneously, TTL values do not
>>>primarily determine the cache-hit-ratio benefit you get by forwarding.
>>>
>>I think this is where I haven't explained myself properly and shot my email
>>off before giving more details --naughty me.
>>
>>Basically, at the moment my setup contains three nameservers: two are
>>authoritative (master/slave) and one is a recursing cache. I did away with
>>forwarding to the ISP's DNS servers, since after reading lots of
>>information on how DNS is supposed to work, and things like DNS poisoning,
>>that is probably not the best way to go.
>>
>>So basically the caching nameserver forwards all queries for my local
>>network to my authoritative slave, as I set it as the default in all the
>>resolvers. Obviously now, if the authoritative servers don't have the
>>information, the caching server will want to resolve the query, if
>>possible, from elsewhere, and so the story goes on.
>>
>>cache.kimi.home's named.conf:
>>view "internal.kimi.home" {
>>
>>zone "." {
>> type hint;
>> file "master/root.cache";
>>};
>>
>>zone "localhost" in {
>> type master;
>> file "master/mst.localhost.db";
>>};
>>
>>zone "0.0.127.in-addr.arpa" in {
>> type master;
>> file "master/mst.loopback.rv";
>>};
>>
>>zone "kimi.home" in {
>> type forward;
>> forwarders { 192.168.1.212; 192.168.1.211; };
>>};
>>
>>zone "1.168.192.in-addr.arpa" in {
>> type forward;
>> forwarders { 192.168.1.212; 192.168.1.211; };
>>};
>>
>>};
>>
>>This was the only config I could get working close to the way I
>>"wanted".
>>
>>Now, what I was thinking of doing is this:
>>
>>new master.kimi.home's named.conf:
>>view "internal.kimi.home" {
>>
>>zone "." {
>> type forward;
>> forwarders { 192.168.1.210; };
>>};
>>
>>zone "localhost" in {
>> type master;
>> file "master/mst.localhost.db";
>>};
>>
>>zone "0.0.127.in-addr.arpa" in {
>> type master;
>> file "master/mst.loopback.rv";
>>};
>>
>>zone "kimi.home" in {
>> type master;
>> file "master/mst.kimi.home.db";
>>};
>>
>>zone "1.168.192.in-addr.arpa" in {
>> type master;
>> file "master/mst.kimi.home.rv";
>>};
>>
>>};
>>
>>The forwarder in that "." declaration is itself only a recursive caching
>>nameserver; it does no forwarding of its own to the ISP's DNS servers.
>>
>>>If you're the *only* client looking up a particular name
>>>through a particular forwarder, for example, then it doesn't matter
>>>whether the TTL is set to 1 second or 1 year, your cache entry for that
>>>name, and the forwarder's cache entry, are going to time out
>>>simultaneously, so there is going to be a latency hit on the next lookup
>>>(obviously the TTL values are, as always, going to have a *local* impact
>>>in terms of the latency-versus-memory tradeoff). What is more important
>>>than TTLs in an absolute sense, is whether the timing and frequency of
>>>the forwarders' other clients' lookups, *relative* to the pertinent TTL
>>>values, is going to result in them taking the latency hits for you to
>>>populate the forwarders' cache in such a way as to speed up your
>>>lookups. That's going to be a highly-situational thing, something that
>>>is very subject to tuning and, in extreme cases, perhaps even a bit of a
>>>Prisoner's Dilemma (e.g. if they have a big batch job that looks up a
>>>bunch of domain names, and you have a similar batch job looking up many
>>>of the same names, and you schedule your job to follow theirs by a
>>>particular time increment, in order to benefit from the cache-hit ratio
>>>that their batch job enhances, they may eventually give up forwarding
>>>because it's not giving them any benefit, and you both may end up having
>>>to fend for yourselves).
>>>
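That shared countdown is easy to observe: query the forwarder for the same
name twice, a little while apart, and the TTL in the answer section keeps
counting down toward the same expiry for every client instead of resetting
(again, the forwarder address is a placeholder):

dig @192.0.2.53 www.example.com A +noall +answer
sleep 30
dig @192.0.2.53 www.example.com A +noall +answer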
>>Being that I am still learning DNS, and more so BIND, I haven't managed to
>>work out how other people get this to work. There must be other people who
>>have wanted a similar setup: one non-forwarding recursive caching
>>nameserver for all the queries that cannot be answered by the authoritative
>>nameservers for the internal network, with clients/resolvers only having
>>the authoritative nameservers listed.
>>
>>      [.]
>>       |
>>    [cache]
>>    /      \
>>[master] [slave]
>>     |    |
>>    [client]
>>
>>>But, more often than not, forwarding is used inappropriately, without
>>>the proper testing and measurement to see if it is warranted or not.
>>>Folks often forward for bureaucratic reasons, because of legacy
>>>configurations left over from a time when their Internet connectivity
>>>was less reliable than it is today, or simply because of bad habits or
>>>ignorance ("we know this configuration works, but we don't know why and
>>>don't want to try anything else, for fear of breaking stuff").
>>>
>You don't have enough boxes in your infrastructure to enforce the 
>recommended absolute separation between resolution (recursive) and 
>hosting (non-recursive) functions *as*well*as* provide redundancy for 
>both functions (you'd need at least 4 boxes for that). So, you're going 
>to have to compromise either the ironclad separation or the redundancy. 
>If it were me, I'd rather sacrifice a little separation for redundancy, 
>so I'd configure all 3 boxes ("master", "slave" and "cache") as 
>recursive resolvers in one view, and the "master" and "slave" boxes as 
>non-recursive in a separate view. That's not a "perfect" separation, but 
>good enough IMO. To elaborate a little more on the design I have in 
>mind: divvy up the clients between the 3 boxes as a (crude) 
>load-balancing measure (all 3 resolvers would be listed in the 
>stub-resolver lists, of course, the only difference would be the order 
>in which they were listed). I wouldn't use forwarding at all, since it 
>doesn't seem to add any value.
>
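To make that concrete, a rough sketch of what the "master" box's named.conf
could look like under that scheme (the view names and ACL are illustrative,
the file names are simply recycled from the configs quoted above, and none of
this has been tested):

acl "internal" { 192.168.1.0/24; 127.0.0.1; };

view "resolver" {
    // Internal clients get full recursion from this view.
    match-clients { "internal"; };
    recursion yes;

    zone "." {
        type hint;
        file "master/root.cache";
    };

    zone "kimi.home" in {
        type master;
        file "master/mst.kimi.home.db";
    };

    zone "1.168.192.in-addr.arpa" in {
        type master;
        file "master/mst.kimi.home.rv";
    };
};

view "authoritative" {
    // Everyone else only gets authoritative data, no recursion.
    match-clients { any; };
    recursion no;

    zone "kimi.home" in {
        type master;
        file "master/mst.kimi.home.db";
    };

    zone "1.168.192.in-addr.arpa" in {
        type master;
        file "master/mst.kimi.home.rv";
    };
};

The "slave" box would be the same apart from "type slave;" and a
"masters { ... };" clause in its zone statements, and the "cache" box would
carry just the recursive view. All three would then appear in every client's
resolver list, only in different orders, e.g.:

nameserver 192.168.1.210
nameserver 192.168.1.211
nameserver 192.168.1.212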
Actually, if you could get someone to serve as your off-site slave 
(sometimes called "secondary" in older terminology), that would give you 
4 boxes total and the best of both worlds: you could dedicate your 
"master" box to that particular function, while not sacrificing any 
redundancy, and deploy your other 2 on-site boxes as a redundant pair of 
resolvers for your clients.
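
If that fourth, off-site box does materialize, the only genuinely new pieces
are the zone-transfer and notify arrangements; something along these lines
(the addresses are placeholders, and the off-site slave would of course need
a routable address for the master, not the 192.168.1.x ones used internally):

// On the on-site master:
zone "kimi.home" in {
    type master;
    file "master/mst.kimi.home.db";
    allow-transfer { 198.51.100.1; };   // the off-site slave
    also-notify { 198.51.100.1; };
};

// On the off-site slave:
zone "kimi.home" in {
    type slave;
    masters { 203.0.113.1; };   // the on-site master, as reachable from outside
    file "slave/kimi.home.db";
};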

                                                                         
                                                               - Kevin



