Recommended setup with large cache memory
Brad Knowles
brad at stop.mail-abuse.org
Fri Sep 9 14:15:58 UTC 2005
At 2:26 PM +0200 2005-09-09, Attila Nagy wrote:
> Do you have benchmarks between Linux and FreeBSD, with and without
> threading? As you say, the above is old, both parties have evolved since.
No, I don't generally have much to do with Linux. There are some
machines I help administer that run Linux, but I'm just one of many
people, and my role there is very limited. I don't run Linux inside
the house, and even if I were to try to install it on my machines, I
have zero confidence that I could create a configuration that would
perform as well as it should.
>>> See http://www.danga.com/memcached/ for details.
>>
>> I am familiar with memcached.
>
> What is your opinion on using that to store the cached data?
I see no advantage to using it over the in-memory database that
BIND already uses, which is optimized for maximum speed and security.
Security matters here because you have to keep Chinese Walls between
the authoritative data and the cached data.
It would certainly be a huge improvement if BIND had to go to
files on disk for each and every query, but that's not the way BIND
works.
> Should I care? It's a cache. If the needed records are not available, it
> goes out to the network and does a query.
> Is this a SPOF?
Well, to the degree that you care about any of the information on
any of those systems, yes.
> Sorry, but I don't understand this. Do people get big, expensive drive
> arrays for squid caches?
What do you think a NetApp NetCache is? It's a big expensive
RAID array with a custom-written caching web proxy that sits in front
of it. The difference is that the NetApp guys are smart, and their
Write Anywhere File Layout (WAFL) is optimized for use in a RAID-4
environment, and they've optimized everything else so that the parity
disk doesn't melt and isn't a bottleneck.
Otherwise, you'd have to run squid on a pretty huge honking
machine with some damn expensive EMC or Hitachi storage area network
devices.
> Is an entry, which can be retrieved from the network anytime valuable,
> which needs extra protection besides its integrity?
If you care anything about latency, yes. Otherwise, you might as
well not run any kind of cache at all, and just always get all your
data across the network.
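To put rough numbers on that trade-off, here is a minimal Python
sketch of the average lookup latency a client sees for a given cache
hit ratio. The function name and the millisecond figures are
illustrative assumptions, not measurements from this thread.

```python
def expected_latency_ms(hit_ratio, cache_ms, network_ms):
    """Average lookup latency when a fraction `hit_ratio` of queries
    is answered from cache and the rest must go out to the network."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * network_ms


# Hypothetical figures: 1 ms for a cached answer, 100 ms for a full lookup.
for ratio in (0.0, 0.5, 0.9):
    print(ratio, expected_latency_ms(ratio, cache_ms=1.0, network_ms=100.0))
```

Losing the cache contents resets the hit ratio toward zero until the
cache warms up again, which is where the latency cost comes from.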
> if you have a virtual IP address with a load balancer, which routes the
> queries to a number of caches you will have inconsistency in the answers.
> For example it will be possible that the first query for mx.domain.com
> will be negative (because in one of the caches there is an entry for it)
> and the next one will give an IP address.
This depends on the load-balancing switch that you're using.
Some switches can be configured for affinity, so that a given query
coming in for a given target will always go to the same back-end
server, or set of back-end servers, unless they are down or have been
taken out of the rotation. In that case, the query will be directed
to one of the other back-end servers, and that machine will then keep
the affinity for queries for that target.
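The affinity behaviour described above can be sketched in a few lines
of Python. This is only an illustration of the idea, not how any
particular switch implements it: the class name, the method names and
the hash-based initial choice are all assumptions made for the example.

```python
import hashlib


class AffinityBalancer:
    """Toy model of target affinity with failover (hypothetical names)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self.affinity = {}  # query name -> back-end currently holding affinity

    def _pick(self, qname):
        # Choose deterministically among the back-ends that are still up.
        candidates = [b for b in self.backends if b in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy back-end servers")
        digest = hashlib.sha256(qname.encode("utf-8")).digest()
        return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]

    def route(self, qname):
        # DNS names are case-insensitive; ignore a trailing dot as well.
        key = qname.lower().rstrip(".")
        current = self.affinity.get(key)
        if current is None or current not in self.healthy:
            # First query, or the old server is out of rotation: pick a
            # new server and let it keep the affinity from now on.
            current = self._pick(key)
            self.affinity[key] = current
        return current

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)
```

With this model, repeated queries for mx.domain.com land on one cache
and so get consistent answers. If that cache is marked down, the
target moves to another server and stays there even after the first
one returns.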
> With a simple queryperf "benchmark" I could do about 35k qps on a UP
> machine, if I query the same (cached) A record. This performance doesn't
> really change with the cache size in use.
You can get those kinds of numbers if you test with a copy of
much larger zones, too. I used .tv in my testing, specifically
because it was guaranteed to be larger than the physical memory of
the server I was working with. But higher-end boxes are going to be
able to keep all of .com or .tv in memory, at which point those kinds
of query rates could continue to be expected.
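For anyone wanting to repeat that kind of test, here is a minimal
Python sketch that builds a query file sampled from many names rather
than repeating a single cached record. It assumes the input format
queryperf reads is one "name type" pair per line; the function name,
its parameters and the example names are hypothetical.

```python
import random


def make_query_file(names, path, qtype="A", count=10000, seed=0):
    """Write `count` lines of "<name> <type>" chosen at random from `names`."""
    if not names:
        raise ValueError("need at least one name")
    rng = random.Random(seed)  # fixed seed so a test run can be repeated
    with open(path, "w", encoding="ascii") as out:
        for _ in range(count):
            out.write(f"{rng.choice(names)} {qtype}\n")
    return path


# Hypothetical usage: `names` would normally come from a copy of a large zone.
make_query_file(["a.example.tv", "b.example.tv"], "queries.txt", count=10)
```

Sampling across a name list larger than the cache is what separates a
memory-bound test from the single-record case quoted above.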
> BTW, a commercial product could handle the same (production) load with
> about 3% of CPU usage, while bind still ate about 30-40% on that machine
> (after upgrading to 9.3.1).
Yeah, products like Nominum's ANS and CNS are designed for much
higher performance. They are next-generation programs, way beyond
the capabilities that you could hope to have in any open-source
product. And you pay money to get that kind of speed.
> I've already seen your paper. I think it would be interesting to repeat
> that experiment.
I'm in the process of doing that, but I'm not going to try
testing different OSes, because I know that I'm not qualified to do
that.
--
Brad Knowles, <brad at stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
SAGE member since 1995. See <http://www.sage.org/> for more info.