The first release of BIND 9 was in September 2000. In the intervening 16 years, we have issued 225 more releases, give or take a few. We have continuously added new features and support for new RFCs.
BIND 9 is a big project: at last count there were 691,554 lines of code* in BIND. That is 3 times the size of PowerDNS, 5 times the size of the Unbound resolver, and 6 times the size of the Knot authoritative server. According to the COCOMO model, the BIND 9 codebase represents an estimated 138 person-years' worth of software development effort.
BIND 9 is complex
As we have added to the original BIND 9 over the years, the code has gotten increasingly complex. This complexity has made it difficult and error-prone to modify. Since we cannot test all the code paths in some of the more complex areas of the code, we may introduce new bugs inadvertently. External developers tend to be limited to working on only the less-complex areas of the code, and even the core team is reluctant to modify some logic.
We already tried and failed to do a complete rewrite of BIND, through the BIND 10 project. Recently, inspired by the alarming experience of spending several weeks trying to pinpoint the source of a severe bug in a particularly complex part of BIND, we decided to start gradually refactoring BIND 9. The goal is to rationalize and simplify some of the most complex functions to improve maintainability. In the process, we also hope to improve quality and, in some areas, remove performance bottlenecks.
How complex is BIND 9?
The McCabe Cyclomatic Complexity Index is one well-known measure of the complexity of a function. It counts the number of linearly independent paths through the code. As a general rule, if C is the complexity, then:
- C <= 10: Function is easy to maintain
- 11 <= C <= 20: Function is harder to maintain
- C > 20: Function is a candidate for refactoring
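As a rough illustration of how the index and the bands above fit together, here is a minimal sketch. It assumes McCabe's standard graph formula, M = E - N + 2P (edges, nodes, and connected components of the control-flow graph). The function names, the toy graph, and the choice to place C = 10 in the first band are assumptions made for this example, not part of any BIND tooling.

```python
def cyclomatic_complexity(graph, components=1):
    """Return McCabe's M = E - N + 2P for a control-flow graph.

    `graph` maps each node to the list of nodes it can branch to.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in graph.values())
    return edges - len(nodes) + 2 * components


def maintainability_band(c):
    """Map a complexity value onto the three bands listed above.

    Assumption: a value of exactly 10 falls in the first band.
    """
    if c <= 10:
        return "easy to maintain"
    if c <= 20:
        return "harder to maintain"
    return "candidate for refactoring"


# Toy control-flow graph of a function containing one if/else.
IF_ELSE = {
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

if __name__ == "__main__":
    c = cyclomatic_complexity(IF_ELSE)
    print(c, maintainability_band(c))
```

A single if/else has 4 edges and 4 nodes, so its complexity is 2. Each additional decision point adds one more independent path.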
[Table: number of functions in BIND 9 with C > 20, with C > 50, and with C > 100]
Witold Kręcicki, the BIND developer who proposed this refactoring project, devised an index to measure the overall complexity of a software system, based on how many complex functions it has.
WPK Maintainability Index
This index measures how many functions need refactoring, and indicates how deep that refactoring needs to be.
He looked at the complexity of several other software systems, including two from ISC, and two newer DNS systems. In the chart below, lower numbers indicate a need for more extensive refactoring. Overall, BIND is more complex than Kea, Knot, or PowerDNS and less complex than ISC DHCP. This correlates with what we know about the maintainability of those projects.
Validation of complexity vs CVEs
We also checked to see where our worst bugs, the critical defects that are published as CVEs, are located. We expected to see some correlation with high complexity code, and we found it.
In the past 2 years ISC has published 21 CVEs in BIND 9. Of those, 18 were in current public versions of named (CVE-2016-2848 was in an unsupported version that was shipped in some operating system distributions, CVE-2016-2775 was in lwresd, and CVE-2016-1284 was in our subscription branch only).
Of these 18 bugs, 13 were in overly complex functions (cyclomatic complexity > 20), and 10 were in very complex functions (> 60). See the table below for more details.
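To put those counts in proportion, the arithmetic can be sketched as follows. The numbers come straight from the text; the variable names and the `share` helper are only illustrative.

```python
TOTAL_CVES = 21          # CVEs published in BIND 9 over the two years
EXCLUDED = 3             # unsupported version, lwresd, subscription branch
IN_NAMED = TOTAL_CVES - EXCLUDED

OVER_20 = 13             # bugs in functions with complexity > 20
OVER_60 = 10             # bugs in functions with complexity > 60


def share(part, whole):
    """Return `part` as a percentage of `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print(IN_NAMED)                   # 18
    print(share(OVER_20, IN_NAMED))   # 72.2
    print(share(OVER_60, IN_NAMED))   # 55.6
```

In other words, roughly 72% of the CVEs in current public versions of named were in functions above the refactoring threshold, and about 56% were in functions with a complexity above 60.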
Resources for refactoring
Our current plan is to refactor three major functions and files in our next major release, BIND 9.12. We estimate this will consume around 25% of our BIND development resources. This means one engineer will be dedicated to refactoring while the remainder of the team focuses on fixing bugs, supporting users, and adding new features.
We will target the most complex functions that we know lie on frequently exercised code paths and where demand for new development is high. The goals for each function will vary slightly, but overall, the objectives are:
- make the complex functions more modular
- rationalize functions that were layered on top of other features
- reduce the complexity of the resulting functions to a McCabe index of 50 or less
- assess and improve test coverage
Refactoring targets for BIND 9.12
Our initial targets for refactoring include:
- RPZ - Response Policy Zones. This feature enables a “DNS firewall” function, by enabling re-writes of responses for some queries. This is an area of frequent enhancement requests but it has also been the source of some serious bugs. The design goal is to disentangle the RPZ implementation from rbtdb.c (the core red-black tree data structure of BIND) so that RPZ policy structures are updated in a manner similar to catalog zone processing in 9.11. This will reduce the scope for new bugs, and make RPZ more readily understandable and maintainable.
- query_find(). This function has an incredible complexity index of 453 before refactoring. Refactoring query_find will mean refactoring all of query.c. The goal is to create a central structure to encapsulate the state and pare off any functions that can be externalized, passing state back and forth. This will result in a number of simpler, easier to understand (and more easily testable) functions.
- resquery(), answer_response(), noanswer_response(). These functions are the source of CVE-2016-9131 and CVE-2016-9147. Each of them is fairly complex (resquery = 154, answer_response = 74, noanswer_response = 79). Refactoring these functions will mean rewriting resolver.c.
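The approach described for query_find(), a central structure that holds the state and is passed between smaller functions, can be sketched roughly as below. This is a toy model in Python, not BIND's C code or its actual design. `QueryContext`, the step functions, the zone data, and the blocking rule are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Toy zone data and policy rule; both are illustrative only.
ZONE: Dict[str, Dict[str, str]] = {"www.example.com.": {"A": "192.0.2.1"}}
BLOCKED_SUFFIX = ".blocked.example."


@dataclass
class QueryContext:
    """Hypothetical central state for one query."""
    qname: str
    qtype: str
    rcode: str = "NOERROR"
    answer: Optional[str] = None
    steps_run: List[str] = field(default_factory=list)


def apply_policy(ctx: QueryContext) -> bool:
    """Policy step: rewrite the response for blocked names. True means done."""
    ctx.steps_run.append("apply_policy")
    if ctx.qname.endswith(BLOCKED_SUFFIX):
        ctx.rcode = "NXDOMAIN"
        return True
    return False


def lookup_zone(ctx: QueryContext) -> bool:
    """Lookup step: fill in the answer from the toy zone. Always final."""
    ctx.steps_run.append("lookup_zone")
    rrsets = ZONE.get(ctx.qname)
    if rrsets is None:
        ctx.rcode = "NXDOMAIN"
    else:
        ctx.answer = rrsets.get(ctx.qtype)
    return True


PIPELINE = (apply_policy, lookup_zone)


def handle_query(qname: str, qtype: str) -> QueryContext:
    """Run each small step in turn, passing the shared state along."""
    ctx = QueryContext(qname, qtype)
    for step in PIPELINE:
        if step(ctx):
            break
    return ctx
```

Because each step only reads and writes the context, it can be tested in isolation, and the number of branches inside any single function stays small.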
We hope to complete these three and release them by the end of 2017 in BIND 9.12. After we finish that, we hope to be able to continue with refactoring. A few top candidates for 2018 are:
- Socket code. BIND was written for a single CPU. named has one thread running on one core and receiving all the requests, and it gives each job to a worker. This causes a lot of context switches, and moving work from one core to another core is costly. The idea is to have multiple listeners, one on each core. This will require a redesign of the handling of incoming connections.
- rbtdb.c. The BIND red-black tree database is the main data store in BIND. We would like to split it into separate functions for cache vs. zone storage.
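One possible way to get "multiple listeners, one on each core" is to open several sockets on the same address and let the kernel spread incoming packets across them. The sketch below assumes the `SO_REUSEPORT` socket option, which exists on Linux and some other platforms. The text does not say which mechanism BIND will use, so this is an illustration of the idea rather than the planned design, and `open_listeners` is a hypothetical helper.

```python
import socket


def open_listeners(count, host="127.0.0.1", port=0):
    """Open `count` UDP sockets bound to the same host and port.

    Assumption: the platform provides SO_REUSEPORT. With port=0 the first
    socket picks a free port and the remaining sockets reuse that port.
    """
    if not hasattr(socket, "SO_REUSEPORT"):
        raise OSError("SO_REUSEPORT is not available on this platform")
    listeners = []
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            sock.bind((host, port))
        except OSError:
            # Clean up everything opened so far before re-raising.
            sock.close()
            for opened in listeners:
                opened.close()
            raise
        port = sock.getsockname()[1]
        listeners.append(sock)
    return listeners
```

In a real server, each socket would then be serviced by its own thread or process pinned to one core, so a request never has to be handed from a single receiving thread to a worker on another core.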
The funding challenge
We are limited in what we can do by our funding model: we are funded primarily by support revenues from users who subscribe to annual support contracts. These users are paying for priority action on bugs that impact them, attention to feature requests, and troubleshooting and diagnostic help. While they will certainly welcome improved code quality, the reality is that everyone would like someone else to fund that. In addition, refactoring is going to mean putting new feature development in some areas on hold, while we are re-writing functions those features will use.
So, while we plan to dedicate 25% of our BIND resources to refactoring, we may have to modify that plan if we can’t find funding for refactoring.
Our first-year goals are to refactor or redesign a small percentage of what needs to be refactored. We hope to be able to continue the refactoring, and deepen it to include removing obsolete features and associated code, in coming years. If we can do this, we can rejuvenate BIND and prolong its relevance for another decade.
Underwriting the refactoring effort
If your organization would like to support this BIND refactoring effort, please contact firstname.lastname@example.org to discuss making a donation. Individual donors, consider making a donation to ISC and mentioning “refactoring” in your comments.
* All figures provided for Lines of Code include blanks and comments.
Appendix - Source of recent CVEs vs code complexity
[Table: recent CVEs and the most complex function in each bugfix]