Recommended setup with large cache memory

Brad Knowles brad at stop.mail-abuse.org
Fri Sep 9 10:30:47 UTC 2005


At 12:13 PM +0200 2005-09-09, Attila Nagy wrote:

>  BTW, I am not the only one:
>  http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044565.html

	First off, Jinmei only tested BIND 9.3.0, not 9.3.1.  There have 
been a number of improvements made in 9.3.1, some of which may have 
come from his testing.

	Secondly, he tested on FreeBSD 5.3, and there was a paradigm 
shift in the way FreeBSD handled SMP between 4.x and 5.x, a process 
which is not expected to be largely complete until sometime in the 
6.x-RELEASE timeframe.  Meanwhile, you should be using Linux instead, as 
Jinmei himself shows at the bottom of that post.  It seems that Linux 
handles the mutexes that BIND uses much better than FreeBSD does.

	But thank you for pointing me at this post.

>  Nope. What I am talking about is dropping the current memory allocator
>  out of bind and replace the store and get procedures (again, I did not
>  look at the code, so if the design is not so clean, it may be harder)
>  with -for example- memcached calls.

	Feel free to make any source code modifications you want, but 
please at least submit your code back to ISC for their consideration.
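
	Not having looked at BIND's cache internals myself either, the 
indirection being proposed might look something like the sketch below. 
All of the names here are hypothetical, and an in-process dict stands 
in for a real memcached connection; a real patch would issue memcached 
set/get calls over the network and use the TTL as the expiry:

```python
# Hypothetical sketch of swapping BIND's internal cache "store" and
# "get" procedures for calls to an external shared cache.  A dict
# stands in for a real memcached client; the class and method names
# are invented for illustration, not taken from BIND's source.

class ExternalCache:
    def __init__(self):
        self._backend = {}          # stand-in for a memcached server

    def store(self, name, rrtype, rdata, ttl):
        # real code: memcached "set", passing the TTL as the expiry
        self._backend[(name, rrtype)] = (rdata, ttl)

    def get(self, name, rrtype):
        # real code: memcached "get"; returns None on a cache miss
        return self._backend.get((name, rrtype))

cache = ExternalCache()
cache.store("www.example.com.", "A", "192.0.2.1", 300)
print(cache.get("www.example.com.", "A"))   # ('192.0.2.1', 300)
print(cache.get("missing.example.com.", "A"))   # None
```

	The hard part, of course, is not this indirection but the fact 
that every cache lookup then pays a network round trip, which is why 
I'd want to see real benchmarks before believing it's a win.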

>  See http://www.danga.com/memcached/ for details.

	I am familiar with memcached.

>  You can run and use multiple memcached machines and if one fails, there
>  is no problem. There is no SPF, if I am right.

	You mean SPOF?  Yes, for the data you've lost, there is 
definitely a SPOF -- the machine that crashed.

>  That's what we are using. But if we -say- have four machines with four
>  gigs of RAM in each of them, this is simply a waste of resources.

	And people who look at their big expensive drive arrays with all 
those disks think they are wasting resources if they don't stripe all 
their data across every one of them.

	Everyone always seems to ignore reliability and tries to shoot 
for absolute maximum performance -- until they have a catastrophic 
failure.  Then they wish they'd gone for high availability and fault 
resilience, instead.

>  If we could have four caches with 512 MB of RAM and four machines with 4
>  gigs, I would have 16 GB of cache, which is unique to the whole cluster,
>  so there are no multiple instances of the same data, and there is no
>  longer the problem of my nameserver at IP 1.1.1.1 giving different
>  answers for subsequent queries.

	Uhh, excuse me?  How are you calculating these numbers?  How are 
you coming to these conclusions?
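
	For what it's worth, the "unique cache" property being described 
is usually obtained by hashing each key to exactly one home node, so 
that total capacity adds up across the cluster and a dead node costs 
roughly 1/N of the entries.  A toy illustration of that placement, with 
made-up node names and key counts (this is not an endorsement of the 
arithmetic above):

```python
# Toy illustration of hash-partitioning a shared cache across nodes:
# each key maps to exactly one node, so no record is stored twice,
# and losing one node loses only that node's share of the entries.
import hashlib

nodes = ["cache1", "cache2", "cache3", "cache4"]

def node_for(key, nodes):
    # hash the query name to pick exactly one home node
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

keys = ["host%d.example.com." % i for i in range(10000)]
placement = {k: node_for(k, nodes) for k in keys}

# if cache3 dies, only its share of the keys (about 1/4) is lost
lost = sum(1 for v in placement.values() if v == "cache3")
print("keys lost if one node fails: %d of %d" % (lost, len(keys)))
```

	Note that naive modulo hashing like this reshuffles most keys 
whenever the node count changes; memcached deployments typically use 
consistent hashing for that reason.  And none of this changes my point 
below: the records on the dead node are still gone.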

>  Also, if a backend server disappears, I don't care about the loss of 4
>  GBs of cached RRs.  What is important is to maintain the working state,
>  which is the case in this setup.

	Hell, yes -- you do care about the loss of that data.  And where 
do you think that the working state is being maintained, anyway?

>  Of course if you have fewer machines, it doesn't really matter.  But
>  BIND isn't so fast, so you have to throw a lot of machines at it, and
>  storing the same data on a lot of machines without showing a consistent
>  view to the customers is bad. :(

	BIND can be plenty fast.  See Jinmei's post that you yourself 
quoted.  Rick Jones has also gotten some very high performance out of 
BIND.  Multiple different sources have pushed it to do 20-30k queries 
per second, or more.

	Yes, there are programs out there that are faster than BIND (see 
my own performance testing at 
<http://www.shub-internet.org/brad/papers/dnscomparison/>), but BIND 
can still do quite nicely.

-- 
Brad Knowles, <brad at stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

   SAGE member since 1995.  See <http://www.sage.org/> for more info.


