dhcpd stuck in recover mode

Tue Sep 13 19:25:39 UTC 2011

First, don't use /32 subnet declarations. Instead, describe the network topology accurately: declare each subnet with the netmask it really has. Otherwise, dhcpd can behave unpredictably.
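
As a rough sketch of what that could look like, known machines can be given host declarations with a fixed-address instead of their own /32 subnets. Everything specific here is made up for illustration: the 192.0.2.0/24 addresses, the MAC address, the host name, and the peer name "dhcp-failover". The failover peer declaration itself is assumed to exist elsewhere in dhcpd.conf.

```
# One subnet declaration that matches the real network (hypothetical range).
subnet 192.0.2.0 netmask 255.255.255.0 {
    option routers 192.0.2.1;

    # Dynamic pool for unknown clients, shared between the failover peers.
    pool {
        failover peer "dhcp-failover";
        deny dynamic bootp clients;
        range 192.0.2.100 192.0.2.200;
    }
}

# A known machine: a host declaration with an address outside the pool,
# rather than a separate /32 subnet.
host known-pc-01 {
    hardware ethernet 00:00:5e:00:53:01;
    fixed-address 192.0.2.10;
}
```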

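Second, OMAPI and failover are unrelated. OMAPI is a management interface to a running dhcpd; failover is the protocol the two servers use to share a pool.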

Third, I've seen the behavior you're describing. You can force normal state through omshell. Just open the failover relationship and set the state to normal (state 2 in the list in the dhcpd(8) man page). I can't be more specific than that because it's been a long time since I've had to do this by hand. The BlueCat Adonis platform has an admin command to force normal state, which runs a script that does exactly what I'm describing. Note that the omshell command will affect both DHCP servers, but it doesn't always work on the first try. If it doesn't work from one server, try it from the other server.
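
For reference, here is a sketch of the omshell sequence, wrapped in a small shell function so the commands can be piped into omshell. The assumptions, none of which come from the original setup: dhcpd.conf enables OMAPI with "omapi-port 7911;", no OMAPI key is configured (otherwise a "key" line is needed before "connect"), and the failover peer is named "dhcp-failover". The dhcpd(8) man page lists local-state 2 as normal and 3 as communications-interrupted.

```shell
#!/bin/sh
# Print the omshell commands that open a failover-state object and set
# its local state. Nothing is sent to dhcpd by this function itself.
#   $1: failover peer name from dhcpd.conf (hypothetical default)
#   $2: numeric state to set (default 2 = normal, per dhcpd(8))
failover_state_cmds() {
    peer=${1:-dhcp-failover}
    state=${2:-2}
    printf 'server localhost\n'
    printf 'port 7911\n'            # assumed omapi-port setting
    printf 'connect\n'
    printf 'new failover-state\n'
    printf 'set name = "%s"\n' "$peer"
    printf 'open\n'
    printf 'set local-state = %s\n' "$state"
    printf 'update\n'
}

# Usage (not run here): failover_state_cmds dhcp-failover | omshell
```

The same man page cautions that changing this state by hand is generally not a good idea, so treat this as a last resort after a normal restart has failed to bring the pair back to normal.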

Regards,
Chris Buxton
BlueCat Networks

On Sep 13, 2011, at 2:19 AM, Tobias Winter wrote:

> Hi list,
> 
> We are using the isc-dhcp-server package from Debian squeeze (64-bit) in an
> OMAPI failover setup.
> 
> The servers are configured to have one shared network in which the
> following is configured:
> - one subnet with a pool of IPs for unknown computers
> - some /32 subnets for already known computers with fixed IPs
> 
> The pool of IPs for unknown computers has the failover peer flag added
> so it is balanced across both dhcpds and no conflicting DHCP leases will
> happen.
> 
> After new hosts are known, they will get a fixed IP and need to be added
> to the dhcpd.conf. After recreating the config files, both dhcpds are
> restarted one after the other: first the primary, then the secondary.
> 
> Normally the secondary detects that the peer dhcp disconnected, it moves
> from normal to communications interrupted, then it detects that the peer
> moves from normal to normal and will also move from
> communications-interrupted to normal. After that, balancing of the
> failover pool takes place and the setup is once again in a consistent state.
> 
> My problem is that sometimes both servers get stuck in recover or
> recover-done mode and will not return to normal operation.
> If that is the case, they will happily continue to serve the fixed IPs
> for known hosts, but will no longer hand out dynamic IPs from the
> failover pool since they are in recover mode.
> 
> Is there any way to see why the server hangs in recover or recover-wait?
> 
> Is it possible to force the dhcpd to normal mode again?
> 
> Can the log verbosity be increased to give better insight as to what is
> going on while they are trying to recover?
> 
> I went through the mailing list archive and googled quite a lot, but I
> can't seem to get something useful out of it. Any help would be greatly
> appreciated.
> 
> - Tobias