One of the common complaints we’ve received over the years about BIND 9 is that large authoritative servers, particularly those with a very large number of small zones, are slow to launch. I’ve met some DNS operators who expressed a powerful aversion to upgrading their systems, because a shutdown and restart can literally take all day.
If that describes you, I have some good news. There is a simple optimization for BIND 9 that can dramatically improve your startup performance. New versions of BIND are being released soon to take advantage of it.
I recently did some profiling experiments on a server with tens of thousands of small zones, and discovered that the delay was not, as I had expected, primarily caused by loading the server configuration and the zone database. In fact, named was spending the vast majority of its time repeatedly walking very long linked-lists. Further examination to find the reason for this revealed a simple but significant tuning bug that’s been overlooked for years: The zone tasks were massively overburdened.
In some ways, BIND 9 is almost like a miniature operating system. From the perspective of your real OS, named is just a single process… but within named, there are more processes, all taking turns doing their jobs, then yielding control to the next miniature process. These internal mini-processes are called “tasks,” and they handle all the functions of the name server - sending queries, answering queries, cleaning the cache, and so on.
Each zone served by a BIND 9 server has a task associated with it, whose job is to do all the routine maintenance for an authoritative zone: sending SOA requests to masters, sending NOTIFY messages to slaves, dumping dynamic zone data to disk, regenerating expiring DNSSEC signatures, and so forth. Since these functions don’t usually all happen at once, a single task can support many zones; too many zones and the task can be overwhelmed.
It turned out that the pool from which the zone tasks were assigned was fixed in size, and much too small. And the damage this did to startup performance was immense: On a test machine with 8 processors and 12 GB of memory, a server with a million zones took well over ten hours to begin serving queries. And no wonder, because those million zones were sharing the resources of only eight zone tasks.
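To put a number on that imbalance, here is a minimal sketch in C of the arithmetic. It assumes zones are spread as evenly as possible across the pool; how named actually assigns zones to tasks is not covered here, and zones_per_task() is a hypothetical helper, not a BIND function.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of BIND: the largest number of zones
 * any one task has to carry when nzones zones are spread as evenly as
 * possible over ntasks tasks (ceiling division).
 */
size_t
zones_per_task(size_t nzones, size_t ntasks) {
	if (ntasks == 0) {
		/* Degenerate case: treat everything as one shared queue. */
		return (nzones);
	}
	return ((nzones + ntasks - 1) / ntasks);
}
```

With a million zones and the fixed pool of eight tasks, zones_per_task(1000000, 8) comes to 125,000 zones per task.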
When I tried increasing the size of the task pool, I expected to see a reduction in startup time. What I didn’t expect was a near elimination of startup time: named started up and began serving queries in a little over fifteen minutes, most of which was spent parsing the very large named.conf file. Loading the zones, a process that had taken over ten hours in the previous run, now took 2-3 minutes.
A larger task pool does take more memory, but it’s negligible compared to the size of the zone data. If you serve hundreds of thousands or millions of zones, you can expect to see a factor-twenty improvement in startup time at the cost of about 2% more memory.
The single change to be made is in the file lib/dns/zone.c. When the function isc_taskpool_create() is called, the third argument — set to 8 in most versions of BIND — should be set to a number that’s roughly one one-hundredth of the number of zones you expect to be serving. (There is also a slight theoretical benefit if the number happens to be prime, though in practice the difference is quite small.)
If you’re running a million zones, you want about ten thousand zone tasks. 10007 happens to be prime. Changing the 8 to 10007 should dramatically improve your startup performance:
    --- zone.c.00	2011-07-12 08:56:34.000000000 -0700
    +++ zone.c	2011-07-12 14:46:44.000000000 -0700
    @@ -12455,8 +12455,7 @@
     	zmgr->transfersperns = 2;
     
     	/* Create the zone task pool. */
    -	result = isc_taskpool_create(taskmgr, mctx,
    -				     8 /* XXX */, 2, &zmgr->zonetasks);
    +	result = isc_taskpool_create(taskmgr, mctx, 10007, 2, &zmgr->zonetasks);
     	if (result != ISC_R_SUCCESS)
     		goto free_rwlock;
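If you would rather compute the number than look it up, the rule of thumb above (about one one-hundredth of the zone count, nudged up to a prime) can be sketched as follows. Both function names are hypothetical, and the floor of 8 is an assumption borrowed from the old fixed default; this is an illustration of the sizing rule, not code from BIND.

```c
#include <stdbool.h>

/* Trial-division primality test; adequate for pool-sized numbers. */
bool
is_prime(unsigned int n) {
	if (n < 2) {
		return (false);
	}
	/* d <= n / d is the overflow-safe form of d * d <= n. */
	for (unsigned int d = 2; d <= n / d; d++) {
		if (n % d == 0) {
			return (false);
		}
	}
	return (true);
}

/*
 * Hypothetical helper: start from nzones / 100, apply a floor of 8
 * (the old fixed default, assumed here as a minimum), then round up
 * to the next prime.
 */
unsigned int
suggest_zone_tasks(unsigned int nzones) {
	unsigned int n = nzones / 100;

	if (n < 8) {
		n = 8;
	}
	while (!is_prime(n)) {
		n++;
	}
	return (n);
}
```

For a million zones, suggest_zone_tasks(1000000) returns 10007, the value used in the patch.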
Better yet, though, don’t bother editing C files, and just install the newest releases of BIND 9. In the upcoming 9.8.1, which will have its third beta release this week, named counts the zones at startup time and automatically scales its zone task table accordingly.
The upcoming 9.6-ESV-R5 and 9.7.4 releases were already very close to final release when this trick was discovered. Since they’ve already been through beta, we decided we’d make a smaller, less-invasive change in those. When the final releases come out in the next week or so, you’ll be able to set an environment variable — BIND9_ZONE_TASKS_HINT — with your desired number of zone tasks.
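For readers curious how a hint like this might be consumed, here is a rough sketch of reading and validating such a variable in C. It is not the code shipped in BIND: the function names, the fallback to 8, and the upper bound of one million are all assumptions made for illustration. Only the variable name BIND9_ZONE_TASKS_HINT comes from the release plan above.

```c
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_ZONE_TASKS 8U	/* the old fixed pool size */
#define MAX_ZONE_TASKS 1000000UL /* arbitrary sanity cap (assumption) */

/*
 * Illustrative only: parse a task-count hint. Falls back to the
 * default when the string is missing, empty, not a plain decimal
 * number, zero, or larger than the sanity cap.
 */
unsigned int
parse_zone_tasks_hint(const char *value) {
	char *end = NULL;
	unsigned long n;

	if (value == NULL || *value == '\0') {
		return (DEFAULT_ZONE_TASKS);
	}
	errno = 0;
	n = strtoul(value, &end, 10);
	if (errno != 0 || *end != '\0' || n == 0 || n > MAX_ZONE_TASKS) {
		return (DEFAULT_ZONE_TASKS);
	}
	return ((unsigned int)n);
}

/* Read the hint from the environment variable named in the article. */
unsigned int
zone_tasks_from_env(void) {
	return (parse_zone_tasks_hint(getenv("BIND9_ZONE_TASKS_HINT")));
}
```

Falling back to the default on bad input means a typo in the variable cannot stop the server from starting; it simply leaves the pool at its old size.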
In later releases of 9.6 and 9.7, we will backport the automatic scaling code, and the environment variable will no longer be necessary.