BIND 9.3.5-P1 random UDP src ports: some DNS responses delivered to wrong process

Fri Jul 18 11:58:01 UTC 2008

Mark Andrews wrote:
>> After upgrading to BIND 9.3.5-P1, I'm seeing some DNS responses
>> arriving at my host being misdirected to other processes (not named)
>> running on my host.
>>
>> It appears to be because when named needs to send a query and chooses
>> a random UDP source port, it's able to bind() that port even though
>> the port's already in use.
>>
>> --
>>
>> My platform is Solaris 10 on SPARC.
>>
>> I have a RADIUS server already bound to IPv4 INADDR_ANY UDP port 1812,
>> specifying SO_REUSEADDR.
>>
>> named is running without specifying any 'listen-on' or 'query-source
>> address'.
>> I see that when it sends a query and chooses a random UDP source port
>> 'x', it binds the socket (which is waiting for the DNS response) to
>> IPv4 INADDR_ANY UDP port 'x', specifying SO_REUSEADDR.
>>
>> Sometimes 'x' happens to be 1812.
>> Solaris 10 allows this second bind() to IPv4 INADDR_ANY UDP port 1812
>> to succeed;
>> I assume that's because of the SO_REUSEADDR.
>>
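The double bind described above can be checked with a small test program. The sketch below is in Python rather than C, and `second_bind_succeeds` is a hypothetical helper name, not anything from BIND or the RADIUS server. It binds one UDP socket to INADDR_ANY on an OS-chosen port, then tries to bind a second socket to the same port, with or without SO_REUSEADDR on both sockets. Whether the SO_REUSEADDR case succeeds depends on the platform, so no result is claimed for it here.

```python
import socket


def second_bind_succeeds(reuse: bool) -> bool:
    """Bind two UDP sockets to the same wildcard port; report whether the second bind worked."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if reuse:
            # Both sockets ask for address reuse, as the RADIUS server and named do.
            first.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            second.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind(("0.0.0.0", 0))          # port 0: let the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("0.0.0.0", port))  # same wildcard address, same port
        except OSError:
            return False
        return True
    finally:
        first.close()
        second.close()
```

Calling `second_bind_succeeds(True)` and `second_bind_succeeds(False)` on the host in question shows whether SO_REUSEADDR is what lets the second bind() through there.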
>> When the DNS response to the query arrives, Solaris may deliver it
>> to the RADIUS server; I can confirm that because my RADIUS server
>> logs these packets as malformed in various ways.
>>
>> (I imagine that the converse may also be true; that some of the
>> packets sent by RADIUS clients to the RADIUS server may instead be
>> delivered to named, but am not running named at a high enough
>> logging level to confirm that.)
>>
>> For a long-running UDP-based server running on a fixed UDP port,
>> I see I can work around this using named's new 'avoid-v4-udp-ports'
>> option.
>> But I imagine that won't solve the problem in general; there may
>> be other UDP servers (say RPC-based servers) that pick ephemeral UDP
>> ports each time they start; I can't specify those ports in named's
>> 'avoid-v4-udp-ports' option.
>>
>> Have I missed something here?
>> (Is it right for BIND to specify SO_REUSEADDR when it binds a socket
>> it will use for a UDP query with a random UDP source port?)
>>     
>  
> 	You will have the problem even without SO_REUSEADDR.
>
> 	<explicit-address>.<port> and <0.0.0.0>.<port> don't collide.
>   
Ouch... Would it perhaps help if named tried <0.0.0.0>.<port> first,
and then, if that didn't collide, went on to bind to
<explicit-address>.<port>?
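
A minimal sketch of that probe-then-bind idea, in Python rather than named's C. `bind_query_socket` is a made-up name for illustration. The sketch assumes the OS reports a collision when the wildcard probe is made without SO_REUSEADDR, which may not hold on every platform, and it leaves a short window between closing the probe and binding the real socket in which another process could take the port.

```python
import socket


def bind_query_socket(address: str, port: int):
    """Return a UDP socket bound to (address, port), or None if the port looks taken."""
    # Step 1: probe the wildcard address, deliberately without SO_REUSEADDR.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind(("0.0.0.0", port))
    except OSError:
        return None  # collided with an existing binding; caller picks another port
    finally:
        probe.close()

    # Step 2: the probe did not collide, so bind the explicit address.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError:
        sock.close()
        return None
    return sock
```

A caller would loop over candidate ports until the function returns a socket instead of None.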

Regards,
Stace
> 	Named doesn't just call bind(0.0.0.0#0) as many systems
> 	don't do good random port selection.  Lots of systems are
> 	sequential.  Linux keeps handing out the same port as long
> 	as it is not in use, then sequentially increments it.
>
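For illustration, application-side port selection of the kind described above might look like the sketch below. It is Python, not named's actual code. `bind_random_udp_port` and its parameters are hypothetical, and the `avoid` set only loosely mirrors the idea behind 'avoid-v4-udp-ports'. The port comes from the `secrets` module rather than from the kernel, and SO_REUSEADDR is deliberately not set.

```python
import secrets
import socket


def bind_random_udp_port(address="0.0.0.0", avoid=frozenset(),
                         low=1024, high=65535, attempts=100):
    """Bind a UDP socket to an unpredictable port in [low, high], skipping ports in avoid."""
    for _ in range(attempts):
        port = low + secrets.randbelow(high - low + 1)  # uniform over the range
        if port in avoid:
            continue                                    # reserved for another service
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((address, port))
        except OSError:
            sock.close()                                # in use or not permitted; retry
            continue
        return sock
    raise OSError("no usable port found in %d attempts" % attempts)
```

For example, `bind_random_udp_port(avoid={1812})` never returns a socket on the RADIUS port, but it cannot know about ports that other servers pick at startup.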
> 	If you can, give named its own address.
>
> 	Explicitly binding the query source will help in some, but
> 	not all cases.  If you are running named on a NAT I would
> 	bind to the internal address and have all the queries go
> 	through the NAT process.  Note this depends on how NAT is
> 	implemented.
>
> 	Very few applications use UDP ports as fast as named now
> 	does and kernels really are not tuned to handle it.
>
> 	For what it is worth, named has code to deal with responses
> 	to queries that are made on 0.0.0.0#53 but arrive on a
> 	socket listening for queries.  The kernel does not have
> 	enough information to deliver the UDP message to the right
> 	socket.
>
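Once a datagram has been read, one thing an application can use to tell a response from a query is the QR bit in the 12-byte DNS header. The Python sketch below shows only that check. `is_dns_response` is a hypothetical helper, and this is not a description of how named itself handles the case.

```python
import struct


def is_dns_response(packet: bytes) -> bool:
    """True if the DNS header's QR bit is set (the message is a response, not a query)."""
    if len(packet) < 12:                 # a DNS header is 12 bytes long
        raise ValueError("packet shorter than a DNS header")
    _msg_id, flags = struct.unpack("!HH", packet[:4])
    return bool(flags & 0x8000)          # QR is the most significant bit of the flags field
```

A receiver on a listening socket could call this on each datagram and hand responses to the resolver side instead of treating them as malformed queries.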
> 	This can all be avoided if everyone signs their zones.
>
> 	http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
>
> 	This could also have been avoided if everyone implemented
> 	BCP38 to the best of their abilities.
>
> 	Mark
>