Failover dhcpd pair stuck in partner-down/shutdown state

Bob Harold rharolde at umich.edu
Wed Jan 2 15:24:20 UTC 2019


On Tue, Dec 25, 2018 at 4:24 AM Eugene Grosbein <eugen at grosbein.net> wrote:

> Hi!
>
> I run two ISC DHCP Servers version 4.3.5 in failover mode.
>
> They had been running fine for several years, with occasional upgrades,
> until I recently found the first one in "partner-down" state and the
> second in "shutdown" state, even though the tcp/647 control connection
> was working perfectly and carrying data according to tcpdump.
>
> They had been in that state for a very long time (over a year), and
> I have no old logs to check due to log rotation. While in that state,
> the second server appended "not responding (shut down)" to the
> DHCPDISCOVER/DHCPREQUEST lines written to its log.
>
> I tried to resolve the issue by stopping the second dhcpd completely
> and starting it again. At startup, it wrote to the log:
>
> dhcpd: failover peer default: I move from shutdown to startup
>
> Then it opened its tcp/647 control connection to the first server,
> exchanged some data over the connection, and appended to its dhcpd.leases file:
>
>         failover peer "default" state {
>           my state shutdown at 4 2017/03/30 02:17:13;
>           partner state partner-down at 4 2017/03/30 02:17:13;
>           mclt 60;
>         }
>
> Then it wrote to the log:
>
> dhcpd: failover peer default: I move from startup to shutdown
>
> And things settled into the same state again.
>
> Restarting the first server did not help either.
>
> I was forced to stop both servers for a short time, manually delete all
> the "failover" records quoted above from both dhcpd.leases files,
> and start the servers again. Only then did both servers reach "normal" state
> (editing only one of the dhcpd.leases files did not help).
>
> My question: why did the servers get stuck in the partner-down/shutdown
> state "forever", unable to leave it without manual intervention despite a
> perfectly working control TCP connection? Is this problem fixed in recent
> versions?
>
> Here is the dhcpd.conf of the first server:
>
> # default ports tcp/647
>
> failover peer "default" {
>         primary;
>         address 62.231.191.161;
>         peer address 62.231.191.174;
>         max-response-delay 60;
>         max-unacked-updates 10;
>         mclt 60;
>         split 128;
>         auto-partner-down 60;
>         load balance max seconds 3;
> }
>
> subnet 62.231.191.160 netmask 255.255.255.252 {}
> include "/usr/local/etc/dhcpd.master";
>
> The second server uses the same configuration except for the IP addresses,
> and it uses an identical dhcpd.master file containing the rest of the
> configuration.
>
>
When you say the second server "uses the same configuration", I hope you did
not accidentally mark both as "primary".
Here is the config on one of my pairs, for comparison:

-------- first server ------------

failover peer "mydhcppair1"
{
primary;
address 141.211.147.232;
port 847;
peer address 141.211.147.248;
peer port 647;
max-response-delay 60;
max-unacked-updates 10;
mclt 1800;
split 128;
load balance max seconds 3;
}


-------- second server ------------

failover peer "mydhcppair1"
{
secondary;
address X.X.X.248;
port 647;
peer address X.X.X.232;
peer port 847;
max-response-delay 60;
max-unacked-updates 10;
load balance max seconds 3;
}

Note that "mclt" and "split" can only be specified on the primary.
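
Applied to your pair, the second server's declaration might look something
like the sketch below. This is only a guess based on the primary config you
posted, not your actual file: the addresses are simply swapped, "primary"
becomes "secondary", and "mclt" and "split" are dropped. I am assuming you
keep the default ports and want "auto-partner-down" on both sides.

-------- second server (sketch, assumptions as noted) ------------

failover peer "default" {
        secondary;                      # exactly one side may be "primary"
        address 62.231.191.174;         # this server (the peer address in your primary config)
        peer address 62.231.191.161;    # the primary
        max-response-delay 60;
        max-unacked-updates 10;
        auto-partner-down 60;           # assumption: carried over from your primary config
        load balance max seconds 3;
        # no "mclt" or "split" here; those go on the primary only
}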

-- 
Bob Harold