BIND 9.3.5-P1 random UDP src ports: some DNS responses delivered to wrong process
Stacey Jonathan Marshall
Stacey.Marshall at Sun.COM
Fri Jul 18 13:24:41 UTC 2008
Mark,
On Solaris it appears that they do collide, unless SO_REUSEADDR is
set on the socket being bound. A colleague ran this test for me:
{blu}: sock
usage: sock [ options ] <host> <port>
sock [ options ] -s [ <IPaddr> ] <port> (for server)
-s operate as server instead of client
-u use UDP instead of TCP
-A SO_REUSEADDR option
{blu}: sock -s -u 23456 & (bind *.23456, no REUSE)
[1] 26891
{blu}: netstat -an | grep 23456
*.23456 Idle
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: ifconfig -a | grep 'inet 129'
inet 129.148.226.18 netmask ffffff00 broadcast 129.148.226.255
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}: kill %1
[1] Terminated sock -s -u 23456
{blu}: sock -s -u -A 23456 & (bind *.23456, REUSE)
[1] 32279
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}: kill %1
[1] Terminated sock -s -u -A 23456
{blu}: sock -s -u 129.148.226.18 23456 & (bind IP.23456, no REUSE)
[1] 33584
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}:
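So, without SO_REUSEADDR on the new socket, the bind fails whether the
existing socket is bound to the wildcard or to the explicit address.
With SO_REUSEADDR on the new socket, it always succeeds, even when the
existing socket did not set it.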
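
For anyone who wants to repeat the test on another platform without
the sock tool, here is a minimal sketch of the same bind matrix in
Python. The function names (try_bind, bind_matrix) and the use of
127.0.0.1 as the explicit address are my own assumptions, not part of
the test above, and the results are platform dependent: the only
outcome I would expect everywhere is that an exact duplicate bind
without SO_REUSEADDR fails.

```python
import socket


def try_bind(addr, port, reuse):
    """Bind a UDP socket to (addr, port); return it, or None if bind() fails."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if reuse:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind((addr, port))
    except OSError:
        # Typically EADDRINUSE: the kernel considers the bindings to collide.
        s.close()
        return None
    return s


def bind_matrix(first_addr="0.0.0.0", first_reuse=False, local_ip="127.0.0.1"):
    """Hold one socket open, then try the four second binds from the transcript.

    Returns {(address, reuse_flag): bind_succeeded}.
    """
    # Port 0 lets the kernel pick a free port for the first socket.
    first = try_bind(first_addr, 0, first_reuse)
    if first is None:
        raise RuntimeError("could not bind the first socket")
    port = first.getsockname()[1]
    results = {}
    try:
        for addr in ("0.0.0.0", local_ip):
            for reuse in (False, True):
                s = try_bind(addr, port, reuse)
                results[(addr, reuse)] = s is not None
                if s is not None:
                    s.close()
    finally:
        first.close()
    return results


if __name__ == "__main__":
    for first_addr, first_reuse in (("0.0.0.0", False), ("0.0.0.0", True),
                                    ("127.0.0.1", False)):
        print("first socket:", first_addr, "REUSE" if first_reuse else "no REUSE")
        for (addr, reuse), ok in bind_matrix(first_addr, first_reuse).items():
            print("  second:", addr, "REUSE" if reuse else "no REUSE",
                  "->", "succeeds" if ok else "fails")
```

The three cases in the main block correspond to the three background
sock processes in the transcript.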
Stacey Jonathan Marshall wrote:
> Mark Andrews wrote:
>>> After upgrading to BIND 9.3.5-P1, I'm seeing some DNS responses
>>> arriving at my host being misdirected to other processes (not named)
>>> running on my host.
>>>
>>> It appears to be because when named needs to send a query and chooses
>>> a random UDP source port,
>>> it's able to bind() that port even though the port's already in-use.
>>>
>>> --
>>>
>>> My platform is Solaris 10 on SPARC.
>>>
>>> I have a RADIUS server already bound to IPv4 INADDR_ANY UDP port 1812,
>>> specifying SO_REUSEADDR.
>>>
>>> named is running without specifying any 'listen-on' or 'query-source
>>> address'.
>>> I see that when it sends a query and chooses a random UDP source port
>>> 'x'
>>> it binds the socket (which is waiting for the DNS response) to IPv4
>>> INADDR_ANY UDP port 'x',
>>> specifying SO_REUSEADDR.
>>>
>>> Sometimes 'x' happens to be 1812.
>>> Solaris 10 allows this second bind() to IPv4 INADDR_ANY UDP port 1812
>>> to succeed;
>>> I assume that's because of the SO_REUSEADDR.
>>>
>>> When the DNS response to the query arrives, Solaris may deliver it
>>> to the RADIUS server; I can confirm that because my RADIUS server
>>> logs these packets as malformed in various ways.
>>>
>>> (I imagine that the converse may also be true; that some of the
>>> packets sent by RADIUS clients to
>>> the RADIUS server may instead be delivered to named, but am not
>>> running named at a high enough
>>> logging level to confirm that.)
>>>
>>> For a long-running UDP-based server running on a fixed UDP port,
>>> I see I can work around this using named's new 'avoid-v4-udp-ports'
>>> option.
>>> But I imagine that won't solve the problem in general; there may
>>> be other UDP servers (say RPC-based servers) that pick ephemeral UDP
>>> ports each time they start;
>>> I can't specify those ports in named's 'avoid-v4-udp-ports' option.
>>>
>>> Have I missed something here?
>>> (Is it right for BIND to specify SO_REUSEADDR when it binds a socket
>>> it will use for a UDP query with a random UDP source port?)
>>>
>>
>> You will have the problem even without SO_REUSEADDR.
>>
>> <explicit-address>.<port> and <0.0.0.0>.<port> don't collide.
>>
> Ouch... Would it perhaps help if named tried <0.0.0.0>.<port>
> first, and then, if that didn't collide, bound to
> <explicit-address>.<port>?
>
> Regards,
> Stace
>> Named doesn't just call bind(0.0.0.0#0) as many systems
>> don't do good random port selection. Lots of systems are
>> sequential. Linux keeps handing out the same port as long
>> as it is not in use then sequentially increments it.
>>
>> If you can, give named its own address.
>>
>> Explicitly binding the query source will help in some, but
>> not all cases. If you are running named on a NAT I would
>> bind to the internal address and have all the queries go
>> through the NAT process. Note this depends on how NAT is
>> implemented.
>>
>> Very few applications use UDP ports as fast as named now
>> does and kernels really are not tuned to handle it.
>>
>> For what it is worth named has code to deal with responses
>> to queries that are made on a 0.0.0.0#53 but arrive on a
>> socket listening for queries. The kernel does not have
>> enough information to deliver the UDP message to the right
>> socket.
>>
>> This can all be avoided if everyone signs their zones.
>>
>> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
>>
>> This could also have been avoided if everyone implemented
>> BCP38 to the best of their abilities.
>>
>> Mark
>>
>
>
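
To make the port-selection discussion above concrete, here is a
minimal sketch in Python of choosing a random UDP source port while
skipping an avoid list and binding without SO_REUSEADDR. It is not how
named is implemented: the function name bind_random_udp_port, its
parameters, and the retry count are all hypothetical. It also only
helps on platforms where the wildcard and explicit-address bindings
collide, as they appear to on Solaris in the test above.

```python
import random
import socket


def bind_random_udp_port(addr="0.0.0.0", avoid=frozenset(),
                         low=1024, high=65535, attempts=20):
    """Pick a random port in [low, high], skip ports in `avoid`, and bind.

    Returns (socket, port). Raises RuntimeError if no attempt succeeds.
    """
    rng = random.SystemRandom()  # OS-provided randomness, not a seeded PRNG
    for _ in range(attempts):
        port = rng.randint(low, high)
        if port in avoid:
            # Comparable in spirit to listing the port in avoid-v4-udp-ports.
            continue
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # No SO_REUSEADDR here, so a port the kernel treats as
            # already in use makes bind() fail and we try another one.
            s.bind((addr, port))
        except OSError:
            s.close()
            continue
        return s, port
    raise RuntimeError("no usable UDP port found in %d attempts" % attempts)


if __name__ == "__main__":
    sock, port = bind_random_udp_port("127.0.0.1", avoid={1812})
    print("bound UDP source port", port)
    sock.close()
```

The avoid set only covers ports known in advance, so servers that
pick ephemeral ports at startup still have to be caught by bind()
failing.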