- By Adib Behjat on August 31, 2011
Yesterday afternoon, ISC published the first alpha release of BIND 9.9.0. This is an early technology preview, showing off some of the work we’ve been doing in BIND 9.
There will be more new features added in later alpha releases, but here’s what’s ready to debut now…
The big new item in 9.9.0a1 is NXDOMAIN redirection. This enables a resolver to respond to a client with locally-configured information when a query would otherwise have gotten an answer of “no such domain”. A recursive operator can use this, for example, to provide alternate suggestions for misspelled domain names. (Names that are in DNSSEC-signed domains are exempted from this when validation is in use.)
In addition to the start-up performance improvements that have already been released in BIND 9.8.1, BIND 9.9.0 improves query throughput on multi-processor systems by allowing multiple threads to listen for and process incoming queries.
There have been several changes to improve DNSSEC usability:
- Zones that were signed automatically via the ‘auto-dnssec’ option used to use NSEC by default, but could be switched to NSEC3 afterward. It is now possible to set NSEC3 parameters in these zones before they sign, skipping the NSEC step entirely.
- The ‘dnssec-signzone -D’ option causes dnssec-signzone to write DNSSEC data to a separate output file. This allows you to put “$INCLUDE example.com.signed” into the zonefile for example.com, run ‘dnssec-signzone -SD example.com’, and the result is a fully signed zone which did *not* overwrite your original zone file. Running the same command again will incrementally re-sign the zone, replacing only those signatures that need updating, rather than signing the entire zone from scratch.
- The ‘dnssec-signzone -X’ option allows signatures on DNSKEY records to have a different expiration date from other signatures. This makes it more convenient to keep your KSK on a separate system, and re-sign the zone with it less frequently.
- A new ‘-L’ option to dnssec-keygen, dnssec-settime, and dnssec-keyfromlabel sets the default TTL for the DNSKEY record.
- The ‘dnssec-signzone -R’ option forces removal of signatures that are not expired but were created by a key which no longer exists.
- dnssec-dsfromkey can now read from standard input, making it easier to convert DNSKEY records to DS. For example, to get the DS records for isc.org, all that is necessary is:
dig dnskey isc.org | dnssec-dsfromkey -f - isc.org
(And stay tuned for BIND 9.9.0a2, which will include a new ‘inline-signing’ option, allowing “bump in the wire” signing.)
- The ‘also-notify’ option now takes the same syntax as ‘masters’, so you can use named lists of servers, and specify TSIG keys.
- The ‘serial-update-method’ option allows dynamic zones to have their SOA serial number set to the current UNIX time if desired, rather than simply incrementing the serial number with each change to the zone.
- The ‘rndc flushtree’ command clears all data under a given name from the DNS cache. So, for example, “rndc flushtree example.com” will not only remove example.com from the cache, but also www.example.com, mail.example.com, and so on.
- The ‘rndc sync’ command dumps pending changes in a dynamic zone to disk without having to go through a freeze/thaw cycle.
- ‘rndc freeze’ and ‘rndc thaw’ no longer remove the zone’s journal file; this makes it possible to use ixfr-from-differences with a dynamic zone.
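The two serial-number strategies mentioned above for ‘serial-update-method’ can be illustrated with a small Python sketch. This is not BIND's code: the function name is hypothetical, the fallback to incrementing when the clock value would not move the serial forward is an assumption of this sketch, and the plain integer comparison ignores the wrap-around rules of DNS serial arithmetic.

```python
import time

SERIAL_MODULUS = 2 ** 32  # SOA serial numbers are unsigned 32-bit values


def next_serial(current, method="increment", now=None):
    """Return the next SOA serial for a dynamic zone change (illustrative only).

    method is "increment" or "unixtime"; `now` lets callers supply a fixed
    clock value, which keeps the function easy to test.
    """
    if method == "unixtime":
        now = int(time.time()) if now is None else int(now)
        # Assumption: only adopt the clock value when it moves the serial
        # forward; otherwise fall through and increment as usual.
        if now > current:
            return now % SERIAL_MODULUS
    # Simplification: add one and wrap at 2**32.
    return (current + 1) % SERIAL_MODULUS


if __name__ == "__main__":
    print(next_serial(2011083100))                   # incremented serial
    print(next_serial(5, "unixtime", now=1314748800))  # jumps to the clock value
```

One consequence worth noting: a zone that uses date-style serials such as 2011083100 already has a serial larger than the current UNIX time, so under the assumption above it would simply keep incrementing.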
- By Adib Behjat on August 18, 2011
Our talk line-up this time includes presentations by ISC on Passive DNS and our new Knowledge Base, World IPv6 Day retrospectives, the BufferBloat project, IOS-XR experiences, BT’s LTE trial, NOC services, PGP keysigning, and the RIPE arbitration process.
Twenty meetings in, it’s worth pausing to look back at UKNOF’s origins, and where we are headed in the future…
It is about a decade since I had the idea that the British Internet Provider and Network Operator community had become a bit fragmented, and was losing out in two ways: from too much information only being shared behind closed doors, and from not enough “clue” about international best practice being absorbed from operators in other countries.
I decided the solution was a national version of forums like NANOG, but scaled down to UK size. Indeed, groups like SwiNOG had already been set up elsewhere in Europe and were doing well. We came up with a Charter for the UK Network Operators’ Forum, which stated its purpose as:
"To act as an open forum for operational, technical and engineering information exchange, co-operation and co-ordination between Internet, Ethernet and other public telecommunications network operators within the United Kingdom."
One of the key UKNOF principles, in the Internet tradition of bodies like IETF and RIPE, is openness – anyone can attend, and all materials and proceedings are in the public domain. This seemed to me to be the best way to do co-ordination and skill-sharing, and this is how we went about our mission of “distribution of clue”.
After some obstacles and false starts, I found the time between positions to get this going in 2005, and it is great that my employer since then, ISC, supports my activity at UKNOF as part of its wider Public Benefit mission.
A key part of UKNOF’s openness remit was to have our meeting costs paid for by sponsors and hence keep attendance free to all. In an industry that was still reeling from the post-dotcom bust, and where relevant training is scarce and expensive, this proved invaluable.
Six years and twenty successful meetings later, UKNOF is now fully financially self-supporting for its direct costs, and attendance at UKNOF meetings has been steady at nearly a hundred people for some time. Going back to a comparison with NANOG, we have roughly the same ratio of meeting attendees to geographic territory population, so we must be doing something right. There’s also our extended community of webcast viewers, nearly 800 UKNOF mailing list members, and our LinkedIn and Facebook groups.
So what happens at a typical UKNOF meeting?
During each day-long event, we have a diverse slate of speakers on various topics of network engineering, operations and deployment relevance. Recurring topics include IPv6 deployment, DNS, case studies, Ethernet and optical technology, IP addressing policy, routing protocols, Internet Exchange operator updates, and network monitoring. Sometimes these will be in the format of panels, or a session of short topical “lightning talks”.
Usually we have around a dozen 30-minute talks during the one-day meeting, but there are also breaks to mingle and network, and often a PGP signing session to build the community’s cryptographic “web of trust”.
We always try and get world-class speakers from outside the UK to share their international experiences, particularly for those UK ISPs who maybe don’t have the budget to attend overseas events. It’s also important to have the meetings in different locations around the country to share what’s going on in the diverse regions, and not always be centered in London just because there’s a lot of infrastructure there.
Our speakers are the life-blood of UKNOF meetings, and our programme could not happen without them preparing and attending on their own time and expense. We are always looking for new speakers, presentations and ideas. If you have any, please submit them to the Programme Committee at email@example.com
We’re conscious of UKNOF being an on-ramp for newcomers in the industry, where they can, without any barriers to entry, test the waters, get to know people, learn, and most importantly become the industry’s next generation of fresh talent and ideas. One thing that helps with this is having social events around the meetings, including of course beer and the now-institutional pre-meeting curry (blame the hungry ex-pat!). This is always subject to sponsor generosity, and while it is an essential icebreaker, we take care to not let the social tail wag the technical event dog.
UKNOF is run on a volunteer not-for-profit basis, and would not have got as far as it has today without the support of many many people. This includes but is not limited to our speakers, our dedicated teams of Programme Committee members and meeting volunteers; various non-profit organisations including in particular LINX, LONAP, Nominet, ISC, RIPE NCC, BBC, and UKIF; as well as Bogons, Jump Networks & Portfast; and our many commercial sponsors and meeting hosts. I’d like to thank them here for all their efforts and contributions. As with HP, our sponsors often come back to support repeat meetings, and new sponsors are always welcome.
Even with committed supporters, growth and success do not just happen; they require active care and feeding to avoid “success disasters”, and now is a good time to look at how we ensure that UKNOF’s success remains stable and sustainable into the future.
At UKNOF18, I presented a proposal for evolving UKNOF’s governance structure into something a little more formal. While on the one hand our informal arrangements have worked well to date, on the other the financial and time commitments for supporting UKNOF have grown over the years. The idea is to put things on a more stable footing with appropriate oversight, while at the same time not burdening a body, which essentially exists mainly to just run meetings, with too much overhead and complexity.
So far, UKNOF’s finances and secretariat functions have been handled by UKIF Ltd (a public company limited by guarantee), originally set up as an ISP trade association. Over time, supporting UKNOF has dominated UKIF activities, and the proposal is to do a “reverse takeover” of the UKIF legal entity so that to all intents it embodies UKNOF.
Public Companies Limited by Guarantee have members rather than shareholders, and it is proposed that UKNOF have those UK Non-Profit Internet co-ordinating bodies that have helped nurture it as its members. The initial member invitations would be made by UKIF’s existing members who are currently its 3 Directors (which includes myself), and the list could be extended by approval of the existing members in future. All of UKIF’s financial and other assets would become those of UKNOF.
In turn, UKNOF’s members would select a Board of Directors, who would perform fiduciary oversight of its activities, appoint Programme Committee members, and ensure that its finances were conducted appropriately. The Member organisations would in turn be accountable to the community through the diverse stakeholders that support their activities.
To get this process going, it is proposed to have a UKNOF Open Governance meeting to approve the transition plan and confirm the initial slate of Member organisations. This is planned for the afternoon of Thursday 8th September at the BBC in White City, London. If you are interested in UKNOF’s future and would like to attend or volunteer to help, please contact me at firstname.lastname@example.org.
- By Adib Behjat on August 9, 2011
Dual Stack Lite is an architecture that allows IPv4 services to be provided in an IPv6 network, despite a limited number of available IPv4 addresses. Work on DS-Lite was conducted within the Softwires working group in the IETF, and began in late 2008. After many revisions it was recently published as RFC6333, with its companion RFC6334 dedicated to automated configuration. Both authors of RFC6334, David Hankins and Tomasz Mrugalski, worked or are currently working for ISC.
In a typical environment, the Internet Service Provider (ISP) usually deploys Customer Premises Equipment (CPE), a small home gateway that performs Network Address Translation (NAT), so the customer can connect several devices, e.g. a desktop computer, a laptop and a WiFi access point. This approach is very convenient, but has the significant drawback of requiring one IPv4 address for each customer. Due to the shortage of IPv4 addresses, that approach is very problematic for many operators, especially the bigger ones.
The DS-Lite architecture, however, differs from the classical IPv4 deployment model. Due to exhaustion of the IPv4 address space, nowadays it is impossible to obtain new IPv4 addresses. To share one address between several customers, NAT had to be moved to a different location. Instead of translating packets on the ISP network border, NAT was moved deeper into the ISP network. IPv6 is used as the transport between the CPE and the NAT. In DS-Lite nomenclature, a CPE performing IPv4 to IPv4-over-IPv6 encapsulation is called the Basic Bridging BroadBand (B4) element. The carrier-grade NAT element located deep within the ISP network is called the Address Family Transition Router (AFTR).
To leverage such an architecture, the B4 element has to learn the address of an AFTR that will serve as a tunnel termination point. Manual configuration is not feasible in most cases, therefore an automated method was defined. The best way to deliver the necessary information to the B4 is using DHCPv6. RFC6334 defined a new option called AFTR_NAME that conveys the fully qualified domain name (FQDN) of an AFTR. The ability to convey a name rather than simply an address offers several benefits. The most desirable is that it allows network operators to use a name that resolves to a different address for different clients, thus providing load balancing.
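To make the option format concrete, here is a minimal Python sketch of how the AFTR name option could be laid out on the wire: a 16-bit option code (64, as assigned in RFC6334), a 16-bit length, and the FQDN as uncompressed DNS labels. The function names and the host name aftr.example.com are hypothetical and used only for illustration; this is not ISC DHCP code.

```python
import struct

OPTION_AFTR_NAME = 64  # DHCPv6 option code assigned in RFC 6334


def encode_fqdn(name):
    """Encode a domain name as uncompressed DNS wire-format labels."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each label must be 1-63 bytes: %r" % label)
        out.append(len(raw))  # one length byte precedes every label
        out += raw
    out.append(0)  # the zero-length root label terminates the name
    return bytes(out)


def build_aftr_name_option(fqdn):
    """Return option-code (2 bytes), option-len (2 bytes), then the name."""
    payload = encode_fqdn(fqdn)
    return struct.pack("!HH", OPTION_AFTR_NAME, len(payload)) + payload


if __name__ == "__main__":
    # Hypothetical AFTR host name, for illustration only.
    print(build_aftr_name_option("aftr.example.com").hex())
```

Because the option carries a name rather than an address, the B4 still has to resolve it through DNS, which is where an operator can hand different clients different AFTR addresses.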
ISC is actively supporting deployment of IPv6 in general. In particular, it is involved in many transition technologies. Dual Stack Lite is one such technology. Both authors of RFC6334 worked or currently are working for ISC. ISC DHCP already allows configuration of this option using a custom option. Dedicated support is planned for ISC DHCP 4.3.
ISC provides an open source reference implementation of AFTR, as well as instructions for configuring a home gateway as B4. Our engineers are also involved in protocol and implementation development of an associated technology called Port Control Protocol (PCP).
- By Adib Behjat on July 12, 2011
ISC BIND 9.8.1b3 is now available.
BIND 9.8.1b3 is the third beta release of BIND 9.8.1.
This document summarizes changes from BIND 9.8.0 to BIND 9.8.1b3. Please see the CHANGES file in the source code release for a complete list of all changes.
The latest versions of BIND 9 software can always be found on our web site at https://www.isc.org/downloads/all. There you will find additional information about each release, source code, and some pre-compiled versions for certain operating systems.
Information on paid support options is available at https://www.isc.org/services/support.
Free support is provided by our user community via a mailing list. Information on all public email lists is available at https://lists.isc.org/mailman/listinfo.
Added a new include file with function typedefs for the DLZ “dlopen” driver. [RT #23629]
Added a tool able to generate malformed packets to allow testing of how named handles them. [RT #24096]
- If named is configured with a response policy zone (RPZ) and a query of type RRSIG is received for a name configured for RRset replacement in that RPZ, it will trigger an INSIST and crash the server. [RT #24280]
- named, set up to be a caching resolver, is vulnerable to a user querying a domain with very large resource record sets (RRSets) when trying to negatively cache the response. Due to an off-by-one error, caching the response could cause named to crash. [RT #24650] [CVE-2011-1910]
- When Response Policy Zone (RPZ) is in use, querying a wildcard CNAME label with query type SIG/RRSIG can cause named to crash. The fix is query type independent. [RT #24715]
- Using Response Policy Zone (RPZ) with DNAME records and querying the subdomain of that label can cause named to crash. Now logs that DNAME is not supported. [RT #24766]
- Change #2912 populated the message section in replies to UPDATE requests, which some Windows clients wanted. This exposed a latent bug that allowed the response message to crash named. With this fix, change 2912 has been reduced to copy only the zone section to the reply. A more complete fix for the latent bug will be released later. [RT #24777]
Improved the startup time for an authoritative server with a large number of zones by making the zone task table of variable size rather than fixed size. This means that authoritative servers with lots of zones will be serving that zone data much sooner. [RT #24406]
Merged in the NetBSD ATF test framework (currently version 0.12) for development of future unit tests. Use configure --with-atf to build ATF internally or configure --with-atf=prefix to use an external copy. [RT #23209]
Added more verbose error reporting from DLZ LDAP. [RT #23402]
The DLZ “dlopen” driver is now built by default, no longer requiring a configure option. To disable it, use “configure --without-dlopen”. (Note: driver not supported on win32.) [RT #23467]
Replaced compile time constant with STDTIME_ON_32BITS. [RT #23587]
Make --with-gssapi default for ./configure. [RT #23738]
Thank you to everyone who assisted us in making this release possible. If you would like to contribute to ISC to assist us in continuing to make quality open source software, please visit our donations page at https://www.isc.org/supportisc.
Evan Hunt — email@example.com
Internet Systems Consortium, Inc.
- By Adib Behjat on July 12, 2011
One of the common complaints we’ve received over the years about BIND 9 is that large authoritative servers, particularly those with a very large number of small zones, are slow to launch. I’ve met some DNS operators who expressed a powerful aversion to upgrading their systems, because a shutdown and restart can literally take all day.
If that describes you, I have some good news. There is a simple optimization for BIND 9 that can dramatically improve your startup performance. New versions of BIND are being released soon to take advantage of it.
I recently did some profiling experiments on a server with tens of thousands of small zones, and discovered that the delay was not, as I had expected, primarily caused by loading the server configuration and the zone database. In fact, named was spending the vast majority of its time repeatedly walking very long linked-lists. Further examination to find the reason for this revealed a simple but significant tuning bug that’s been overlooked for years: The zone tasks were massively overburdened.
In some ways, BIND 9 is almost like a miniature operating system. From the perspective of your real OS, named is just a single process. . . but within named, there are more processes, all taking turns doing their jobs, then yielding control to the next miniature process. These internal mini-processes are called “tasks”, and they handle all the functions of the name server—sending queries, answering queries, cleaning the cache, and so on.
Each zone served by a BIND 9 server has a task associated with it, whose job is to do all the routine maintenance for an authoritative zone: sending SOA requests to masters, sending NOTIFY messages to slaves, dumping dynamic zone data to disk, regenerating expiring DNSSEC signatures, and so forth. Since these functions don’t usually all happen at once, a single task can support many zones; but too many zones and the task can be overwhelmed.
It turned out that the pool from which the zone tasks were assigned was fixed in size, and much too small. And the damage this did to startup performance was immense: on a test machine with 8 processors and 12 GB of memory, a server with a million zones took well over ten hours to begin serving queries. And no wonder, because those million zones were sharing the resources of only eight zone tasks.
When I tried increasing the size of the task pool, I expected to see a reduction in startup time. What I didn’t expect was a near elimination of startup time:
Named started up and began serving queries in a little over fifteen minutes, most of which was spent parsing the very large named.conf file. Loading the zones, a process that had taken over ten hours in the previous run, now took 2-3 minutes. (Full details of the tests and results can be found here.)
A larger task pool does take more memory, but it’s negligible compared to the size of the zone data. If you serve hundreds of thousands or millions of zones, you can expect to see a factor-twenty improvement in startup time at the cost of about 2% more memory.
The single change to be made is in the file lib/dns/zone.c. When the function isc_taskpool_create() is called, the third argument—set to 8 in most versions of BIND—should be set to a number that’s roughly one one-hundredth of the number of zones you expect to be serving. (There is also a slight theoretical benefit if the number happens to be prime, though in practice the difference is quite small.)
If you’re running a million zones, you want about ten thousand zone tasks. 10007 happens to be prime. Changing the 8 to 10007 should dramatically improve your startup performance:
--- zone.c.00 2011-07-12 08:56:34.000000000 -0700
+++ zone.c 2011-07-12 14:46:44.000000000 -0700
@@ -12455,8 +12455,7 @@
     zmgr->transfersperns = 2;
     /* Create the zone task pool. */
-    result = isc_taskpool_create(taskmgr, mctx,
-                                 8 /* XXX */, 2, &zmgr->zonetasks);
+    result = isc_taskpool_create(taskmgr, mctx, 10007, 2, &zmgr->zonetasks);
     if (result != ISC_R_SUCCESS)
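If you would rather compute the number than guess it, the sizing rule described above (roughly one zone task per hundred zones, nudged up to a prime) is easy to script. The following is a minimal Python sketch; the function names and the lower bound of 2 are choices made for this illustration and are not part of BIND.

```python
def is_prime(n):
    """Trial-division primality test; fine for numbers of this size."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def suggested_zone_tasks(zone_count, zones_per_task=100):
    """Return the smallest prime >= zone_count / zones_per_task (at least 2)."""
    candidate = max(zone_count // zones_per_task, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


if __name__ == "__main__":
    print(suggested_zone_tasks(1000000))
```

For a million zones this yields 10007, the value used in the patch.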
Better yet, though, don’t bother editing C files, and just install the newest releases of BIND 9. In the upcoming 9.8.1, which will have its third beta release this week, named counts the zones at startup time and automatically scales its zone task table accordingly.
The upcoming 9.6-ESV-R5 and 9.7.4 releases were already very close to final release when this trick was discovered. Since they’ve already been through beta, we decided we’d make a smaller, less invasive change in those. When the final releases come out in the next week or so, you’ll be able to set an environment variable—BIND9_ZONE_TASKS_HINT—with your desired number of zone tasks.
In later releases of 9.6 and 9.7, we will backport the automatic scaling code, and the environment variable will no longer be necessary.