Kaminsky Vulnerability Mailing List FAQ
Steven Stromer
filter at stevenstromer.com
Thu Jul 31 21:36:20 UTC 2008
PLEASE NOTE: I am in no way related to the ISC or any other DNS
authority... just a user who wants to play his part in helping to
work toward the resolution of this vulnerability! And, I also want to
thank the ISC for its Herculean efforts to protect us all! This FAQ
started as an internal document that ballooned. Hope someone else
finds this helpful.
*********************************
KAMINSKY VULNERABILITY MAILING LIST FAQ
*********************************
It seems to me that the traffic on the bind mailing list has exploded
since word about the Kaminsky vulnerability was posted. I've
literally had a hard time managing my inbox. I've also noted that
this may be affecting others, as repeat questions have become kind of
prevalent. I've put together this compact summary of pertinent
topics that have hit the list in the last month, hoping to make it
easier to find answers to some of the most frequently asked
questions regarding the vulnerability and its prevention. I have not
included much
attribution, honestly, because I don't have sufficient time - I hope
that no one takes offense... Besides, a quick search will make it
pretty easy to find the original sources of any assertion contained
herein. Also, I think we all know who the power hitters in this forum
are! Please don't hesitate to post corrections to any errors I may
have introduced by attempting to create this FAQ!
Regards,
Steven Stromer
IMPORTANT: ISC has announced that -P2 versions of the patch will be
made available at the end of the week of July 28, 2008 (We wait with
bated breath!). These upgrades promise to correct many (but, not
necessarily all) of the bugs reported in -P1 patches and in present
beta versions. Until the -P2 patches are available, ISC recommends
that the -P1 patch be applied, despite the potential performance hit
on high volume servers (>10k queries/sec). -P2 patches will not
contain the port customization capabilities (see below for a brief
description) promised for the 9.5.1 version. I will refer to
this announcement as 'the -P2 announcement' where applicable. For
full announcement, see:
http://marc.info/?l=bind-users&m=121726908015389&w=2
ALSO NOTE: This summary does not address Windows-specific issues. Sorry.
*************************
WHICH SERVERS ARE AT RISK
*************************
Installations of BIND are at risk. Some other DNS servers have
always randomized their source ports and are likely not at risk.
If your server is authoritative only, it is not at risk.
If it is recursive, it is at risk. No ifs, ands, or buts about it.
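One rough way to tell, from saved dig output, whether a server offers
recursion is to look for the 'ra' (recursion available) flag in the
header. A sketch (the has_recursion helper is made up for
illustration; feed it the output of 'dig @server example.com'):

```shell
# has_recursion FILE: inspect saved dig output and report whether the
# server set the 'ra' flag, i.e. offers recursion.
has_recursion() {
    if grep -Eq '^;; flags:.* ra[; ]' "$1"; then
        echo "recursive -- at risk, patch it"
    else
        echo "no 'ra' flag -- likely authoritative only"
    fi
}

# Example against a canned dig header line:
sample=$(mktemp)
echo ';; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0' > "$sample"
has_recursion "$sample"
rm -f "$sample"
```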
**********************************
CHANGING NATURE OF BIND PORT USAGE
**********************************
TCP
All BIND versions use high, random ports for TCP connections. The
Kaminsky vulnerability doesn't affect TCP queries. And, no, DNS
cannot be turned into a TCP only system.
UDP
The Kaminsky vulnerability does affect UDP queries. Prior to 9.5.0,
BIND chose a high, random UDP *source* port on startup and used it
for the life of the process for outbound queries. 9.5.0 improved on
this by choosing from a small pool of 8 ports and randomly changing
ports every 15 minutes, but it contains no feature for customizing
the ports used without altering source code. The -P1 versions
(containing the Kaminsky vulnerability fix) introduce per-query
randomization across all available high ports (1024-65535).
The present betas (9.5.1b2 and 9.4.3b1), following on the -P1
versions, allow fine-grained control over the UDP ports used. This
fine-grained control is not yet completely fleshed out; ISC docs
explain the developing nature of these controls, which include the
use-v4-udp-ports and use-v6-udp-ports options in named.conf, plus
OS-level sysctl tunables such as net.inet.ip.portrange.hifirst and
net.inet.ip.portrange.hilast. Important port selection considerations
include a) permitting at least 16384 ports, for 14 bits of entropy,
to obtain a desirable amount of port randomization, and b) picking a
range that will not interfere with ports required by other running
services. Note that the queryport options are obsolete in 9.5.1,
which uses a random source port for every query.
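As a sketch of what the beta-era controls can look like in
named.conf (option names from the ISC docs; exact syntax may shift
as the betas develop, and the port numbers here are only an
illustration):

```
options {
    // At least 16384 ports, for roughly 14 bits of entropy:
    use-v4-udp-ports { range 32768 65535; };
    // Keep named away from a port another local service needs:
    avoid-v4-udp-ports { 33434; };
};
```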
DESTINATION PORT
All DNS queries will continue to have a destination of port 53.
********************************
MINIMUM MANDATORY UPGRADE STEPS
********************************
1. Update BIND to the -P1 release within your present branch:
(NOTE: Please see the -P2 announcement, above, as it pertains to this.)
9.3.x -> 9.3.5-P1
9.4.x -> 9.4.2-P1
9.5.0 -> 9.5.0-P1
2. Confirm that either there is no 'query-source port ##' statement
in named.conf, or, if it does exist, that it is set to 'port *'.
3. Open unprivileged UDP ports on firewall.
4. Test to confirm port randomization.
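For step 2, a grep along these lines can flag a pinned query-source
port. A sketch (check_query_source is an invented helper; point it
at your real named.conf rather than the sample built here):

```shell
# check_query_source FILE: warn if named.conf pins the UDP source port.
check_query_source() {
    if grep -Eq 'query-source.*port[[:space:]]*[0-9]+' "$1"; then
        echo "WARNING: fixed query-source port found -- set it to 'port *'"
    else
        echo "OK: no fixed query-source port"
    fi
}

# Demo against a sample config with a pinned port:
conf=$(mktemp)
echo 'options { query-source address * port 53; };' > "$conf"
check_query_source "$conf"
rm -f "$conf"
```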
***********************
FIREWALL CONSIDERATIONS
***********************
Most modern firewalls have an option to do "udp keep state". In an
ideal world, use this option for dns activity on unprivileged ports.
Alternatively, open all UDP ports >1024 to the name server's IP
address. In practice this should not be a problem if no other
services requiring these ports are running. If other services do use
some of these ports, use a combination of avoid-*-udp-ports in
named.conf and firewall rules to block those specific ports, and
allow all the others. In the various
beta versions of BIND there is an option to specify a range of ports
for named to use for outgoing UDP queries (see above), which should
make it easier to configure the firewall.
Confirm that NATing firewalls are not rewriting the source ports on
outbound DNS queries, or all your patching and port randomization
will be for naught. This is usually configurable, but confirm it's
not occurring. Cisco FWSM and various PIX releases are being reported
as particularly troublesome in this regard.
It is also possible to use iptables firewall rules to (further)
randomize the source port. SEE:
http://cipherdyne.org/blog/2008/07/mitigating-dns-cache-poisoning-attacks-with-iptables.html
Or, to do the same with pf:
http://blog.spoofed.org/2008/07/mitigating-dns-cache-poisoning-with-pf.html
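As an illustration of the "udp keep state" approach in iptables
syntax (a sketch, not taken from the posts above; rule order and any
existing chains on your firewall are assumptions):

```
# Let named send queries from any high source port to remote port 53,
# and accept only UDP replies matching an outstanding outbound query:
iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
iptables -A INPUT  -p udp --sport 53 -m state --state ESTABLISHED -j ACCEPT
```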
**********************************
TEST TO CONFIRM PORT RANDOMIZATION
**********************************
The DNS-OARC and Doxpara tests would previously pass nameservers
with even 9.5.0-level port randomization (a pool of only 8 ports),
though this now seems to have been fixed. It is still wise to
confirm that more than 8 (16?) ports are being used.
-------------------------------
TEST OPTION #1: DNS-OARC at CLI
-------------------------------
At command prompt:
dig +short porttest.dns-oarc.net TXT
To test a specific nameserver, at command prompt:
dig @<ip_of_nameserver> +short porttest.dns-oarc.net TXT
Response will include one of the following ratings:
Rating    Standard Deviation    Bits of Entropy
GREAT     3980 -- 20,000+       13.75 -- 16.0
GOOD      296 -- 3980           10.0 -- 13.75
POOR      0 -- 296              0 -- 10.0
Note the standard deviation shown at the end of the response - you
want 5 digits before the decimal point.
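If you have collected a list of observed source ports (e.g. from
tcpdump), a rough standard-deviation check in the spirit of the
DNS-OARC rating can be done with awk. A sketch (port_stddev and the
sample ports are illustrative):

```shell
# port_stddev: read one port per line, print the (population) standard
# deviation; per the table above, under ~296 is POOR, over ~3980 GREAT.
port_stddev() {
    awk '{ n++; sum += $1; sumsq += $1 * $1 }
         END { mean = sum / n; printf "%d\n", sqrt(sumsq / n - mean * mean) }'
}

# Example: eight tightly clustered ports -- a POOR spread (prints 4)
printf '%s\n' 32768 32770 32772 32774 32776 32778 32780 32782 | port_stddev
```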
-------------------------------
TEST OPTION #2: DNS-OARC on Web
-------------------------------
Visit:
https://www.dns-oarc.net/oarc/services/dnsentropy
Click 'Test My DNS'
-------------------------------------
TEST OPTION #3: Other Web-based tests
-------------------------------------
http://www.doxpara.com
http://member.dnsstuff.com/tools/vu800113.php
I'm sure this list is not comprehensive...
---------------------------------
TEST OPTION #4: PERL based tester
---------------------------------
Download at:
http://michael.toren.net/code/noclicky/noclicky-1.00.pl
Download patch to same directory as perl script, for more accurate
results:
http://www.smtps.net/issues/01-noclicky.patch
Apply the patch:
$ patch -p0 <01-noclicky.patch
Run the perl script.
------------------
ADDITIONAL TESTING
------------------
In addition to running one of the above tests, run tcpdump and
confirm it is showing multiple ports on queries.
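A minimal capture for this (the interface name eth0 is an
assumption; run as root):

```
# Each outbound query appears as SRC_IP.SRC_PORT > DST_IP.53; the
# source port should change from one query to the next:
tcpdump -n -i eth0 udp dst port 53
```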
***********************************
DOCUMENTED ISSUES WITH -P1 VERSIONS
***********************************
(NOTE: Please see the -P2 announcement, above, as it pertains to this.)
A number of threading issues appear when applying some -P1 patches
to certain OS/hardware combinations. There also appear to be some
configuration settings that reduce these errors in some instances.
Betas of the next versions, applied to avoid the -P1 issues,
introduce another set of bugs/errors (see below).
If you are concerned about upgrading a busy server, Jinmei Tatuya
(ISC) has created a testing tool to help pre-determine whether your
server will experience problems. The test tool is available at:
http://www.jinmei.org/selecttest.tgz
For more info, see:
http://marc.info/?l=bind-users&m=121745487721871&w=2
At this time, ISC is recommending that everyone stay within their
branch and move to the -P1 releases (ie, if you are at 9.3.x, move to
9.3.5-P1, 9.4.x users should move to 9.4.2-P1). Unless you have a
definitive need to run the beta code, they recommend remaining with
the -P1 releases to reduce the number of changes that are being
introduced into your environment.
----------------------------
'TOO MANY OPEN FILES' ERROR:
----------------------------
NOTE: Seems to be affecting those running 9.5.0-P1. This is not
necessarily a fatal error, but will return errors to clients.
LOGGED AS:
general: error: socket.c:1996: unexpected error:
general: error: internal_accept: fcntl() failed: Too many open files
OR:
named[xxx]: [ID xxxxx daemon.error] general: error: socket: too many
open file descriptors
EXPERIMENTAL SOLUTIONS:
1. Increase file-descriptors default from 256 to 4096 for 32-bit
apps, or to 65535 for 64-bit apps.
Set the #define __FD_SETSIZE in /usr/include/linux/posix_types.h to
4096, save, and recompile.
2. Increase max-cache-size to 512M.
3. Edit /etc/security/limits.conf to allow processes to have more
open files:
cat /proc/sys/fs/file-max to see what the kernel thinks is the max
number of open files system-wide
Edit /etc/security/limits.conf as follows:
* - nofile 16384
SEE: http://kbase.redhat.com/faq/FAQ_80_1540.shtm
4. Set tcp-clients and tcp-listen-queue to 1000 (a seemingly
discredited solution)
5. Increase ulimit in limits.conf:
Use ulimit -n to see how many open files you currently allow.
Edit /etc/security/limits.conf as follows:
* soft nofile 16384
* hard nofile 16384
(This changes the limits for everything, but if on dedicated
nameservers, this isn't an issue.)
************************************
DOCUMENTED ISSUES WITH BETA VERSIONS
************************************
----------------------------
'ASSERTION FAILURE' ERROR:
----------------------------
LOGGED AS:
#general: resolver.c:5494: REQUIRE((((query) !=
0) && (((const isc__magic_t *)(query))->magic == ((('Q') << 24 | ('!')
<< 16 | ('!') << 8 | ('!')))))) failed
#general: exiting (due to assertion failure)
EXPERIMENTAL SOLUTIONS:
1. ISC is asking that willing beta testers experiencing this error
apply the following patch to help capture detailed debugging info:
http://www.jinmei.org/bind-9.4.3b2-dispatch.diff
and:
http://www.jinmei.org/patch/dispatch.c.diff
2. A possible temporary solution is to recompile beta versions from
source without threads (opens the door to possible performance
degradation).
3. Increase ISC_SOCKET_MAXEVENTS to 12.
----------------------------------
ANOTHER 'ASSERTION FAILURE' ERROR:
----------------------------------
LOGGED AS:
named[xxxxx]: socket.c:1736: INSIST(!sock->pending_recv) failed
named[xxxxx]: exiting (due to assertion failure)
ISC REPORTS:
We've [...] a fix to that in our development tree (btw: this bug is
irrelevant to the recent port randomization change). Unfortunately
the fix won't be in the next patch version (P2) [...]
--------------------------------------------------
'MAXIMUM NUMBER OF FD EVENTS (64) RECEIVED' ERROR:
--------------------------------------------------
NOTE: Seems to be affecting bind-9.4.3b2 (Possibly isolated to
Solaris...)
LOGGED AS:
general: sockmgr: maximum number of FD events (64) received
EXPERIMENTAL SOLUTIONS:
Increasing ISC_SOCKET_MAXEVENTS from 64 to 128 seems to reduce
frequency of warning.
----------------------------
'BAD FILE HANDLE' ERROR:
----------------------------
EDITOR'S NOTE: Mentioned in list, but could not locate any
documentation...
**************************************
REDHAT'S RESPONSE TO THE VULNERABILITY
**************************************
RedHat's advisory is at:
http://rhn.redhat.com/errata/RHSA-2008-0533.html
The advisory does not clearly specify whether the patches only
remove the query port restriction from their sample named.conf, or
whether the full code update needed to do real port randomization
has been implemented. The dig test should confirm whether
randomization is active. The advisory details that updates to the
selinux-policy packages permit port randomization, so update these,
too.
*******************************
HANDLING OLDER VERSIONS OF BIND
*******************************
OPTION #1: UPGRADE!
OPTION #2: For the moment, set older, unpatchable servers to use a
newer server as a forwarder.
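A minimal forwarding setup for such a legacy server might look like
this in named.conf (192.0.2.1 is a placeholder for your patched
resolver):

```
options {
    forwarders { 192.0.2.1; };  // patched, port-randomizing resolver
    forward only;               // never iterate on its own
};
```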
****************************************
DNSSEC AS PART OF A LONGER TERM SOLUTION
****************************************
Implement DNSSEC. This may be more practical for busy admins once
9.6 is released. Until then, zones need to be re-signed on a regular
schedule, manually. Another consideration is security: with NSEC, a
signed zone basically reveals its full contents (the zone can be
walked record by record), though this too is supposed to be resolved
in 9.6 (via NSEC3).
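The manual re-signing cycle that paragraph refers to looks roughly
like this (zone name and key file names are placeholders; see the
ISC guide linked below for the full procedure):

```
# Re-sign before the RRSIGs expire (30 days by default), then reload:
dnssec-signzone -o example.com -k Kexample.com.+005+11111.key \
    db.example.com Kexample.com.+005+22222.key
rndc reload example.com
```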
EDITOR'S COMMENT: It seems that the discovery of this vulnerability
has reinforced the need for universal implementation of dnssec. 9.6
seems to promise a lot. I, for one, look forward to its release, and
hope that all pertinent and available resources are being thrown at
its completion!
Good implementation instructions are at:
http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
Posted corrections to this document (Version 1.4 contains these
corrections):
On page 31: dnssec-keygen -a rsasha1 -b 4096 -n ZONE -k KSK zonename
Should be: dnssec-keygen -a rsasha1 -b 4096 -n ZONE -f KSK zonename
On page 49: dlv.isg.org. 3 257 "BEA[...]gDB";
Should be: dlv.isc.org. 257 3 5 "BEA[...]guDB";
Also SEE:
http://alan.clegg.com/dnssec
*******************************************************************
CACHE SNOOPING IS A RELATED, STILL UNADDRESSED VULNERABILITY IN DNS
*******************************************************************
Chris Buxton, of Men & Mice, provided a fantastic explanation of this
that all should read:
http://groups.google.com/group/comp.protocols.dns.bind/msg/b6c67170b468d693