DNS performance testing - FreeBSD & Solaris - BIND & djbdns
Matt Simerson
mpsimerson at hostpro.com
Thu Jun 21 21:29:02 UTC 2001
The reference hardware for FreeBSD systems is the HP LPr with a single
700MHz PIII CPU and 1GB RAM. Solaris Sparc** systems are E420s with 2GB of
RAM and a single 450MHz CPU.
NOTES:
* Much of the Solaris testing is incomplete due to difficulties: first with
hardware (shipping damage), then with OS tuning (requiring help from Sun
engineers), and finally a bonehead error by a sysadmin that cost several of
the days the Solaris machines needed to chew through the Class A testing.
** It must be noted that we don't have a tremendous amount of Sun expertise
within our department. The Suns arrived preconfigured from the guys in our
company who do have lots of expertise with them; I installed dnscache,
BIND 8, top, and a few other utilities to monitor the system. Tuning was
done by them with the help (and lack thereof) of a couple of Sun engineers.
*** This testing is almost entirely limits checking. It is in addition to
the tests I posted earlier to this list reflecting the real-world
performance testing I was doing. What follows are excerpts from the QA test
documents produced as part of our testing.
5. Features/Items To Be Tested
1. Caching performance - Publicly available DNS servers will NOT be caching
   servers. Caching servers will be available ONLY to DNS requests from our
   IP space.
2. Query performance - How many queries can each server reliably serve in a
   given unit of time?
3. Propagation time - How long it takes from the initial entry of a DNS
   change before that change is available on the public internet.
4. Uptime - How reliable is the server? Are there any known DoS attacks?
5. Security - A comprehensive security audit will be performed on each
   system. Host and application security will be tested thoroughly.
7. System manageability - How easy is it for a system administrator to make
   changes? How much time is necessary for routine administration tasks?
8. Scalability - Use load balancers to determine how load scales across
   multiple servers.
6. Testing Strategy / Approach
1. Caching performance.
For testing caching performance, we'll use a DNS client program named
dnsfilter running on 1 to 3 client machines to rapidly query the name
servers. Dnsfilter takes a list of IP addresses as input and outputs the
result as a list of IP addresses and the hostnames returned by the DNS. I
wrote a program that generates an input file suitable for dnsfilter. An
input file named "iplist-a" will contain an entire Class A block of IP
addresses, and a second file named "iplist-b" will contain an entire Class B
block of IP addresses.
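The generator program itself isn't reproduced here, but producing such an
input file is straightforward. A minimal sketch (the 216.122 network comes
from the test setup; the awk approach is mine, not the original program):

```shell
# Sketch: enumerate every address in the 216.122/16 (Class B) block,
# one per line, in the form dnsfilter reads on stdin.
awk 'BEGIN {
  for (i = 0; i < 256; i++)
    for (j = 0; j < 256; j++)
      print "216.122." i "." j
}' > iplist-b

wc -l < iplist-b    # 65536 lines, one per address
```

The iplist-a file is the same idea over a whole Class A (16 million lines),
so generating it with a real program rather than a shell one-liner pays off.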
A dnswall has been set up on 216.122.69.110. The program is named walldns
and is capable of answering over 6,000 dns requests per second. This is well
beyond the limits of all the dns servers and caches that we are testing.
Each caching name server (dnscache, BIND 8, BIND 9) will be configured to
forward all requests to the dnswall. The resolv.conf file on the caching
name servers will contain no "search" or "domain" keywords and will contain
only one nameserver entry, pointing all DNS resolution to the currently
active DNS server on that machine. The resolver will be configured to use
only the DNS for resolution.
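In other words, resolv.conf on each caching box reduces to something like
this fragment (the loopback address is my assumption; the point is a single
nameserver line and no search/domain keywords):

```conf
# /etc/resolv.conf on a caching name server under test:
# exactly one nameserver entry, pointing at the currently
# active dns server on this machine (address assumed here).
nameserver 127.0.0.1
```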
To begin the test, start up the first DNS server to be tested. For dnscache,
the command "services start" will start it up. For BIND 8, the command "ndc
start" will work. BIND 9 must be started manually. Grab an SSH login on the
DNS client and get ready to start testing. After starting up the caching
server, start the monitoring program "watchme.sh". Watchme is a script that
records the next 5 minutes of activity on the server, writing the CPU and
memory usage of each server to a file named "watchme.out". We'll then run a
batch of tests and record the output. So, here's exactly what to do:
Server:
# script ; records the session
# services start ; starts up dnscache
# watchme.sh ; monitors server CPU and RAM usage.
Client:
# script
# time runtest-b.sh > b1.out
# time runtest-b.sh > b2.out
# time runtest-b.sh > b3.out
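The watchme.sh script itself isn't included in the QA document. A rough
sketch of what such a monitor might look like (the one-second interval, ps
column keywords, and the process-name argument are my assumptions, not the
original script):

```shell
#!/bin/sh
# Hypothetical reconstruction of watchme.sh: sample the CPU and
# resident memory of one process once per second, appending each
# sample to watchme.out.
watchme() {
    proc=${1:-dnscache}     # process name to watch (assumed argument)
    samples=${2:-300}       # 5 minutes at one sample per second
    : > watchme.out
    n=0
    while [ "$n" -lt "$samples" ]; do
        # pcpu = %CPU, rss = resident set size in KB
        ps -axo comm,pcpu,rss |
            awk -v p="$proc" '$1 == p { print $2, $3; exit }' >> watchme.out
        sleep 1
        n=$((n + 1))
    done
}
```

The starting/ending RAM values and max CPU called for below then fall
straight out of the first line, last line, and a sort of watchme.out.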
Record the following from watchme.out: starting and ending RAM value, max
CPU.
Record the following from the time output: real time.
Grep the output files to count successful queries ("grep = a?.out | wc -l").
Calculate the accuracy and qps. Accuracy = success / total, where total is
the number of lines in the file iplist-b. Qps (queries per second) =
success / time.
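Worked through with the numbers from the first dnscache Class B run below
(the awk one-liner is just one way to do the arithmetic):

```shell
# First dnscache Class B run: 65,536 successes out of 65,536
# queries in 30 seconds of real time.
success=65536
total=65536     # lines in iplist-b
t=30            # "real" seconds reported by time(1)
awk -v s="$success" -v n="$total" -v t="$t" \
    'BEGIN { printf "accuracy=%d%% qps=%d\n", 100 * s / n, s / t }'
# prints: accuracy=100% qps=2184
```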
A table is included below to record the data. Once the first run is done for
dnscache, clear the cache by restarting the server ("svc -t
/service/dnscache"), start the watchme script, and then run the test again
on the clients, this time with the Class A iplist:
Server:
# svc -t /service/dnscache ; restarts dnscache
# watchme.sh ; monitors server CPU and RAM
usage.
Client:
# time runtest-a.sh > a1.out
# time runtest-a.sh > a2.out
# time runtest-a.sh > a3.out
Record the results and then proceed to the next DNS server (BIND 8) and
finally to BIND 9. Follow the exact same tests as above, altering only the
commands used to start and stop BIND as documented above.
65,536 queries (Class B) - 1 client - FreeBSD

Cache      start RAM  end RAM  CPU  Time (sec)  Success  Accuracy  qps
dnscache   100        100      52%  30          65,536   100       2,184
dnscache   100        100      57%  16          65,536   100       4,096
dnscache   100        100      51%  16          65,536   100       4,096
BIND 8     2          8        38%  13          65,536   100       5,041
BIND 8     8          8        82%  30          65,536   100       2,185
BIND 8     8          8        87%  33          65,536   100       1,986
BIND 9     5          10       95%  72          65,536   100       910
BIND 9     10         10       61%  23          65,536   100       2,850
BIND 9     10         10       64%  23          65,536   100       2,850
65,536 queries (Class B) - 1 client - Solaris

Cache      start RAM  end RAM  CPU  Time (sec)  Success  Accuracy  qps
dnscache   100        100      76%  76          65,536   100       862
dnscache   100        100      80%  33          65,536   100       1,986
dnscache   100        100      80%  34          65,536   100       1,927
BIND 8     3          9        35%  53          65,536   100       1,237
BIND 8     9          9        88%  51          65,536   100       1,285
BIND 8     9          9        92%  51          65,536   100       1,285
16,711,680 queries (Class A) - 1 client - FreeBSD

Cache      start RAM  end RAM  CPU  Time      Success     Accuracy  qps
dnscache   100        100      85%  2:14:35   16,777,216  *         2,087
dnscache   100        100      85%  2:15:41   16,777,216  *         2,071
dnscache   100        100      85%  2:24:45   16,777,216  *         1,942
BIND 8     2          gok      85%  23:31:00  CRASH       *         FAILED
BIND 8     *          *
BIND 8     *          *
BIND 9     5          gok      96%  22:29:55  FAILED      *         FAILED
BIND 9     *          *
BIND 9     *          *
**** Solaris testing cancelled due to time constraints.
Once that batch of tests is run, we'll do the exact same series of tests
again, except with three clients hitting the name servers at the same time.
65,536 queries (Class B) - 3 clients - FreeBSD

Cache      start RAM  end RAM  CPU %  Time (sec)  Success  qps
dnscache   100        100      75     63          196,608  3,121
dnscache   100        100      65     50          196,608  3,932
dnscache   100        100      65     50          196,608  3,932
BIND 8     2          8        85     44          196,608  4,468
BIND 8     8          8        95     86          196,608  2,286
BIND 8     8          8        94     83          196,608  2,369
BIND 9     5          12       95     108         196,608  1,820
BIND 9     12         12       94     69          196,608  2,849
BIND 9     12         12       94     66          196,608  2,979
**** Solaris testing cancelled due to time constraints.
2. Query performance.
For BIND, cache performance is the same as query performance because BIND
reads all the zone files in at start time and serves them from its cache.
For djbdns, we need to test the query performance of tinydns, which will be
serving all the authoritative data. We'll do this by creating a data file
for tinydns that serves all the reverse space within 216.122 and pointing
dnsfilter at it (just like in the cache testing). Record all the following
data from each run:
Server:
# services start
# watchme.sh ; monitors server CPU and RAM
usage.
Client:
# time dnsfilter < iplist-b > b1.out
# time dnsfilter < iplist-b > b2.out
# time dnsfilter < iplist-b > b3.out
Record the results and then proceed to the next DNS server (BIND 8) and
finally to BIND 9. Follow the exact same tests as above, altering only the
commands used to start and stop BIND as documented above.
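The tinydns data file for the reverse space can be generated mechanically.
A sketch (the host-N-N naming scheme and example.com domain are invented
placeholders; "^" lines are tinydns-data's PTR record syntax):

```shell
# Sketch: emit one PTR ("^") line per address in 216.122/16 in
# tinydns-data format. Hostnames are made-up placeholders.
awk 'BEGIN {
  for (i = 0; i < 256; i++)
    for (j = 0; j < 256; j++)
      printf("^%d.%d.122.216.in-addr.arpa:host-%d-%d.example.com\n", j, i, i, j)
}' > data
# then run tinydns-data to compile data into data.cdb
```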
65,536 queries (Class B) - FreeBSD - 1 client

Server     start RAM  end RAM  CPU %  Time (sec)  qps
tinydns    1          1        72     18          3,641
tinydns    1          1        74     18          3,641
tinydns    1          1        72     18          3,641

65,536 queries (Class B) - Solaris - 1 client

Server     start RAM  end RAM  CPU %  Time (sec)  qps
tinydns    2          2        96     159         412
tinydns    2          2        97     159         412
tinydns    2          2        96     159         412
In contrast to dnscache, tinydns isn't significantly affected by logging.
Whereas by default dnscache chewed through disk space like it was a blue
light special, tinydns is fairly lightweight. I first ran the tests with
logging on (because I forgot to turn it off) and the times for the runs
were 19 seconds each. With logging off, that dropped to 18 seconds. Full
logging with tinydns is not as expensive as I had guessed.
As we can see, tinydns doesn't run well on Solaris at all. It screams along
on FreeBSD, so I'm guessing the difference is due to FreeBSD's filesystem
caching being more aggressive than Solaris's.