DHCP failover problems - still

Robert Blayzor rblayzor.bulk at inoc.net
Tue May 12 15:01:06 UTC 2009


On May 7, 2009, at 5:25 PM, David W. Hankins wrote:
> We have at least two bugs with failover reconnection attempts, of
> which one has a fix committed to maintenance branches (will be in
> 4.1.1-next, 4.0.2b1, etc), and the other is being reviewed for
> inclusion.

This, at least in our situation, does not get to the root of the
problem.  Reconnection and keepalives are certainly an issue when the
link between two peers is broken.  In our situation, however, we see
active servers (exchanging several messages a second) suddenly each
decide that the other has stopped responding.  Usually one logs that
the other has timed out and disconnects, and then 20 seconds later the
other one logs a timeout.  All of this happens even though all other
network connectivity between the two servers is fine.  After that, of
course, they never reconnect.


> Failover will claim a link has been disconnected if it is idle (no
> received messages) for more than the configured max-response-delay
> (default 20 seconds), or if the socket has been disconnected.  A
> contact message is used at 1/3rd the max-response-delay to keep the
> socket from going idle.
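
To make sure I am reading the timing described above correctly, here
is a rough Python sketch of it.  This is only my interpretation, not
the actual ISC dhcpd code: the class and method names are made up for
illustration, and I am assuming a contact message is sent whenever
nothing has been sent for 1/3rd of max-response-delay.

```python
from dataclasses import dataclass


@dataclass
class FailoverLinkTimer:
    """Hypothetical model of the idle-timeout rule quoted above."""

    max_response_delay: float = 20.0  # seconds; the default quoted above
    last_received: float = 0.0        # time the last peer message arrived
    last_sent: float = 0.0            # time we last sent anything

    @property
    def contact_interval(self) -> float:
        # Contact messages go out at 1/3rd of max-response-delay.
        return self.max_response_delay / 3.0

    def on_receive(self, now: float) -> None:
        # Any received message resets the idle clock.
        self.last_received = now

    def on_send(self, now: float) -> None:
        self.last_sent = now

    def should_send_contact(self, now: float) -> bool:
        # Assumption: send a contact message once we have been quiet
        # for a full contact interval.
        return now - self.last_sent >= self.contact_interval

    def is_timed_out(self, now: float) -> bool:
        # The link is declared down only after MORE than
        # max-response-delay seconds with nothing received.
        return now - self.last_received > self.max_response_delay
```

If that reading is right, two peers exchanging several messages a
second should never get anywhere near the timeout, since every
received message resets the idle clock.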



Is this something new in 4.x only?  We don't think we've seen this in
3.1.2.

-- 
Robert Blayzor, BOFH
INOC, LLC
rblayzor at inoc.net
http://www.inoc.net/~rblayzor/





