SV: BIND 9.1.2 and TinyDNS???

Brad Knowles brad.knowles at skynet.be
Thu Jun 21 03:55:23 UTC 2001


At 12:42 AM +0000 6/21/01, Don Stokes wrote:

>  I can think of plenty of counter-examples.  The one that most springs to
>  mind is reverse map lookups on logs etc -- I've seen this done in batch
>  and make the name server process grow to several times its normal size,
>  full of data that simply won't be queried again.

	Right, but as these queries come in, they cause other data to be
thrown away.  So you end up throwing out all the stuff you actually
will use again, and keeping only whatever garbage fits into the tiny
amount of RAM you've specified.

	You would be much, much better off either creating a dedicated
nameserver for processing things like web logs (which have a
significantly different locality of reference than most of your
regular queries), or making sure that the machine has enough memory
to hold both the records you're asking for once and almost certainly
won't need again before they expire, plus all your regular data.
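
	Just to illustrate the first approach: a batch log-resolution job
can be pointed at its own caching server, so it never touches the
cache your production traffic depends on.  A rough sketch, assuming
the dnspython package (2.0 or later) and a hypothetical dedicated
resolver at 192.0.2.53 -- substitute whatever you actually run:

# Batch reverse lookups aimed at a dedicated caching resolver.
# Assumes dnspython >= 2.0; 192.0.2.53 is a made-up example address.
import dns.resolver
import dns.reversename

log_resolver = dns.resolver.Resolver(configure=False)
log_resolver.nameservers = ["192.0.2.53"]     # not your production caches

def resolve_ptr(ip):
    """Return the PTR name for an address, or the address on failure."""
    try:
        name = dns.reversename.from_address(ip)
        for rdata in log_resolver.resolve(name, "PTR"):
            return rdata.to_text()
    except Exception:
        pass
    return ip

for ip in ("192.0.2.1", "198.51.100.7"):
    print(ip, resolve_ptr(ip))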

>  Most of your arguments appear to describe a situation where every record
>  in the cache has an equal and high probability of being re-used.  Real
>  life ain't like that.

	Not at all.  Indeed, the case you bring up fits my complaint 
*perfectly*.  Thank you!

>  TTLs are not set based on client usage, but by the server administrator
>  at the primary.

	Yup, I'm aware of this.

>                   They have no way of knowing how long you really need to
>  keep your cache for, just how often they expect to update their DNS and
>  therefore the maximum time records should be kept, so a name that gets
>  queried once only will sit in the cache for maybe days, while another
>  name queried every hour will get queried every time because the TTL is
>  less than that.

	Right.  No surprise there.

>  LRU would help this.

	Ahh, but is the cache really flushed on an LRU basis?  I wouldn't
bet on it.  My guess is that they simply toss out the records with
the lowest remaining TTL, and keep doing so until they drop below a
certain threshold.
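
	To make the difference concrete, here's a toy sketch of the two
eviction policies side by side.  This is purely illustrative -- I'm
not claiming this is how BIND manages its cache internally.  Under a
burst of one-shot reverse lookups with long TTLs, the lowest-TTL
policy happily keeps the junk and throws out your popular short-TTL
names, while LRU does the opposite:

# Toy comparison of LRU eviction vs. lowest-TTL-first eviction.
# Purely illustrative; not a description of BIND internals.
from collections import OrderedDict

class LRUCache:
    def __init__(self, maxsize):
        self.maxsize, self.data = maxsize, OrderedDict()
    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)          # mark as recently used
            return self.data[key]
    def put(self, key, value, ttl):
        self.data[key] = (value, ttl)
        self.data.move_to_end(key)
        while len(self.data) > self.maxsize:
            self.data.popitem(last=False)       # evict least recently used

class LowestTTLCache:
    def __init__(self, maxsize):
        self.maxsize, self.data = maxsize, {}
    def get(self, key):
        entry = self.data.get(key)
        return entry[0] if entry else None
    def put(self, key, value, ttl):
        self.data[key] = (value, ttl)
        while len(self.data) > self.maxsize:
            victim = min(self.data, key=lambda k: self.data[k][1])
            del self.data[victim]               # evict lowest remaining TTL

for cache in (LRUCache(4), LowestTTLCache(4)):
    cache.put("www.example.com", "192.0.2.1", ttl=300)   # popular, short TTL
    cache.put("mail.example.com", "192.0.2.2", ttl=300)
    for i in range(10):                         # burst of one-shot lookups
        cache.put("junk%d.in-addr.arpa" % i, "host%d" % i, ttl=86400)
        cache.get("www.example.com")            # popular names stay hot
        cache.get("mail.example.com")
    print(type(cache).__name__, sorted(cache.data))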

>                        Obviously, cache sizes need to be set in
>  accordance with load -- if you're constantly re-querying because the
>  cache is too small, you need to throw more memory at the problem.  But
>  that's no reason to let the cache grow out of control due to an
>  occasional piece of odd behaviour.  The lovely thing about an LRU
>  approach is that it will hit the unrequeried records with long TTLs
>  first, i.e. the ones you are least likely to re-use.

	Problem is, the first time something like this happens, you may
not understand why your DNS resolution is so slow.  All you see is
that queries seem to take forever.  What you don't see, because the
server hides it completely, is that the nameserver is critically
short on cache memory, so it's throwing away everything in its cache
in order to keep up with the log processing you're doing, including
all the records you'd normally want to keep.


	This is a pernicious option that tends to hide serious problems
until they've grown so bad that you can't do anything useful about
them -- you're just plain screwed.

	You'd be much better off discovering problems like this while
they're still brewing, before they become really serious.  In this
case, having your nameserver thrash around swapping and paging like
mad is actually good, because that's something you can easily detect
with tools like ps, top, vmstat, iostat, sar, etc.  A nameserver that
is strangely slow for no reason you can see is a much, much harder
problem to diagnose.
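
	As a rough sketch of the kind of visibility I mean: something as
simple as periodically sampling the nameserver's resident size next
to the latency of a test lookup will show you a resolver in trouble
long before users start complaining.  This assumes the psutil
package; the process name "named" and the probe hostname are my
examples, not anything built into BIND:

# Rough monitoring sketch: sample named's memory use and lookup latency.
# Assumes the psutil package; "named" and the probe name are examples.
import socket, time
import psutil

def named_rss_mb():
    for proc in psutil.process_iter(["name", "memory_info"]):
        if proc.info["name"] == "named":
            return proc.info["memory_info"].rss / (1024.0 * 1024.0)
    return 0.0

def lookup_latency_ms(name="www.example.com"):
    start = time.monotonic()
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        pass                                    # a failure still gets timed
    return (time.monotonic() - start) * 1000.0

while True:
    print("named rss=%.1f MB  lookup=%.1f ms"
          % (named_rss_mb(), lookup_latency_ms()))
    time.sleep(60)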

>  In real life, the cost of extra DNS queries is an extra packet each way,
>  typically well under half a kbyte.

	Not at all.  Bandwidth is nothing.  Latency is everything.  If
each DNS query gets held up by another 500ms because the data
effectively cannot be locally cached (because your nameserver is
needlessly thrashing its own cache), then more programs stack up in
memory waiting for DNS answers, more of them decide that the DNS
server is dead and retransmit, and you quickly work yourself into a
very seriously nasty situation where you effectively can't do
anything at all.
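
	The arithmetic is brutal.  By Little's law, the number of
processes sitting in memory waiting on DNS is roughly the query rate
times the lookup latency, so even a modest query rate turns a handful
of waiting processes into a hundred or more once lookups take half a
second.  A back-of-the-envelope sketch (the rate and latencies here
are made-up numbers, purely for illustration):

# Little's law: concurrency = arrival rate * time in system.
# The query rate and latencies below are made-up illustrative numbers.
queries_per_second = 200.0

for latency_ms in (2.0, 50.0, 500.0):
    waiting = queries_per_second * (latency_ms / 1000.0)
    print("latency %6.1f ms -> about %5.1f processes waiting on DNS"
          % (latency_ms, waiting))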


	Do you remember Black Tuesday a few years ago, when AOL went off
the air for nineteen hours?  Do you remember what happened to e-mail
across the entire Internet?  It stopped.  Dead.  AOL was advertising
too many MXes in their DNS, causing DNS UDP responses to be
truncated.  When that happens, the query is supposed to be retried
over TCP.  Only, AOL wasn't there, and TCP wasn't getting through.

	During the first hour of the blackout, you might still have had
locally cached answers for which machines to try to deliver mail to,
but those TTLs were set low (so that, under normal circumstances,
people would have to re-query frequently to get the latest data, and
they'd almost certainly get a different list than the last one they'd
gotten, resulting in a very crude form of load balancing).

	But if you tried to contact any of those mail servers, you got a
connection timeout.  Since there were a total of nine names returned,
each with five IP addresses, that was forty-five IP addresses to chew
through.  A typical TCP connection timeout is two minutes per
connection, so you'd wait ninety minutes cycling through all
forty-five IP addresses before discovering that a single message
could not be delivered.

	Now, imagine what happens when places all around the world queue
up mail for AOL.  Moreover, almost all of them are firing up a queue
run every sixty minutes, but each run now takes at least ninety
minutes, so it overlaps the previous run by at least thirty minutes.
More and more copies of sendmail get fired up and then hang in
memory, effectively permanently.  Repeat until there is no memory
left on the system, and it crashes.
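
	A quick sketch of that arithmetic (the queue interval, timeout,
and arrival rate here are illustrative numbers, not AOL's real
traffic): because each queue run needs far longer than the interval
between runs, every run that starts is still grinding away when the
next one begins, and the pile only grows:

# Toy sketch of hourly queue runs piling up; all numbers are illustrative.
queue_interval_min = 60             # sendmail -q1h: a queue run every hour
per_message_min = 9 * 5 * 2         # 45 addresses x 2-minute timeouts = 90
new_messages_per_hour = 4           # made-up arrival rate of AOL-bound mail

runs = []                           # (start time, minutes of work) per run
queued = 0
for hour in range(8):
    now = hour * queue_interval_min
    queued += new_messages_per_hour
    runs.append((now, queued * per_message_min))  # each run retries them all
    active = sum(1 for start, work in runs if start + work > now)
    print("t=%4d min: %2d msgs queued, %d queue runs still running"
          % (now, queued, active))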

	Voila, you've taken down e-mail for the entire Internet.  Only, 
since I was the Sr. Internet Mail Admin at AOL at the time, I was 
personally blamed for this by many people around the world.  Some of 
those people got quite nasty, and made direct threats of physical 
violence against me.


	The situation you're proposing is effectively the same.  The only
difference is that you appear to actively want to do this to yourself
(rather than to the entire rest of the Internet) if some sort of
unexpected runaway event should come along and overload your
nameservers.

>  The cost of having the name server fail is at best several seconds to
>  fail over on a query if there are redundant caches, and total loss of
>  service if there aren't (or if all caches are flooded).  To me, these
>  are unacceptable.

	In my experience, if you ever fail over to a different caching
server, you are completely and totally dead anyway.  The systems I am
familiar with handle enough volume that in the thirty seconds it
would take to fail a query over to the backup, the entire machine has
effectively ground to a complete halt.

	This is why you set up a local caching nameserver on each and
every machine, so that the local caching nameserver dies only if the
machine itself dies.  Even forwarding all unresolved queries to a set
of central caching servers carries too much risk of failure (see
pages 333-335 of chapter 11 in the 4th edition of _DNS & BIND_ by
Paul Albitz and Cricket Liu).
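
	For what it's worth, the layout I mean is nothing more exotic
than running a caching nameserver on every box and pointing the local
stub resolver at it first.  Something like this (the addresses are
examples only; whether you list central caches as a fallback at all
is your call):

# /etc/resolv.conf on each machine (example addresses only)
# local caching nameserver first; central caches only as a last resort
nameserver 127.0.0.1
nameserver 192.0.2.11
nameserver 192.0.2.12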

>  LRU doesn't solve my first example -- in that case, all cached records
>  would be older than the new ones and therefore flushed.  But if the
>  alternative was having to restart the name server process and therefore
>  flush the cache anyway, that would still be preferable, and LRU would
>  ensure that the new records were the first out the door (assuming they
>  weren't re-used) when the cache started to fill with useful data again.

	The key is the "R"ecently in LRU.  If they do actually track the
last time a particular record was used, then the records that are
referenced once and never again should form a relatively small
working set in the nameserver, and they should flush out only each
other, leaving the more frequently referenced records alone.

	Of course, that's the ideal.  The reality is probably worse, with
all the recently referenced records at risk of being flushed down the
drain as the garbage one-reference-only queries come in.


	Of course, all this cache flushing activity has to take a huge 
toll on the nameserver.  My guess is that if it ever gets to the 
point where it has to start flushing its cache, that's effectively 
the same as BIND starting to page and/or swap, and you're in a 
similar "dead meat -- you'll never recover from this" type of 
situation.

-- 
Brad Knowles, <brad.knowles at skynet.be>

/*        efdtt.c  Author:  Charles M. Hannum <root at ihack.net>          */
/*       Represented as 1045 digit prime number by Phil Carmody         */
/*     Prime as DNS cname chain by Roy Arends and Walter Belgers        */
/*                                                                      */
/*     Usage is:  cat title-key scrambled.vob | efdtt >clear.vob        */
/*   where title-key = "153 2 8 105 225" or other similar 5-byte key    */

dig decss.friet.org|perl -ne'if(/^x/){s/[x.]//g;print pack(H124,$_)}'

