ISC DHCP fails to sync leases between peers

Glenn Satchell glenn.satchell at uniq.com.au
Mon Sep 19 14:36:00 UTC 2011


Here is the problem: the failover connection is dropping, which stops
the leases from syncing.

Sep 19 10:31:20 primary dhcpd: peer foo: disconnected
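If you want to see how often this happens over a longer period, a small
filter along these lines might help. It is only a sketch: it assumes syslog
lines in the same format as the one above (optionally with a PID after
"dhcpd"), and the function name find_disconnects is made up for this example.

```python
import re

# Matches lines such as:
#   Sep 19 10:31:20 primary dhcpd: peer foo: disconnected
# The optional "[1234]" after dhcpd is an assumption about other syslog setups.
DISCONNECT = re.compile(
    r"^(?P<when>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) "
    r"dhcpd(?:\[\d+\])?: peer (?P<peer>\S+): disconnected"
)


def find_disconnects(lines):
    """Return (timestamp, host, peer) for every failover disconnect line."""
    found = []
    for line in lines:
        match = DISCONNECT.match(line)
        if match:
            found.append((match["when"], match["host"], match["peer"]))
    return found
```

Feed it the lines of your syslog file (for example, an open file object) and
compare the timestamps with what the secondary logged at the same moments.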

Make sure TCP port 647 (the failover port you configured) is open on any
routers or firewalls between the primary and secondary. The reverse
connection looks fine, as there is no error on the secondary.
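A quick way to check this from each server is to attempt a plain TCP
connection to the peer's failover port. The following is a minimal sketch,
not a diagnostic tool: the function name failover_port_reachable is invented
here, the port default is taken from your config, and dhcpd must be running
on the peer for the port to answer. The peer may log the stray connection.

```python
import socket


def failover_port_reachable(host, port=647, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (addresses from the quoted configuration):
#   on the primary:   failover_port_reachable("10.10.10.12")
#   on the secondary: failover_port_reachable("10.12.0.254")
```

If it returns False in one direction only, look at the firewall rules or
router ACLs for that direction first.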

Perhaps try packet sniffing on port 647 on each server to see if there is
any traffic.

regards,
-glenn

> I use ISC DHCP version 4.1.1 on Debian GNU/Linux on both servers. I tried
> to solve the following problem using various versions of ISC DHCP, but it
> remained the same.
>
> My configuration for failover between two servers on different subnets is:
>
> #-----------------------------------------------
> # Primary Server
> #-----------------------------------------------
>
> authoritative;
> default-lease-time 900;
> max-lease-time 1800;
> option domain-name "foo.com";
> option domain-name-servers 10.12.0.254;
>
> failover peer "foo" {
>     primary;
>     address 10.12.0.254;
>     port 647;
>     peer address 10.10.10.12;
>     peer port 647;
>     max-response-delay 30;
>     max-unacked-updates 10;
>     load balance max seconds 3;
>     mclt 1800;
>     split 128;
> }
>
> subnet 10.12.0.0 netmask 255.255.0.0 {
>     pool {
>         failover peer "foo";
>         range 10.12.10.0 10.12.112.0;
>         range 10.12.112.12 10.12.255.254;
>         deny dynamic bootp clients;
>     }
>     option routers 10.12.0.254;
>     option subnet-mask 255.255.0.0;
>     option broadcast-address 10.12.255.255;
> }
>
> #-----------------------------------------------
> # Secondary Server
> #-----------------------------------------------
>
> authoritative;
> default-lease-time 900;
> max-lease-time 1800;
> option domain-name "foo.com";
> option domain-name-servers 10.12.0.254;
>
> failover peer "foo" {
>         secondary;
>         address 10.10.10.12;
>         port 647;
>         peer address 10.12.0.254;
>         peer port 647;
>         max-response-delay 30;
>         max-unacked-updates 10;
>         load balance max seconds 3;
> }
>
> subnet 10.12.0.0 netmask 255.255.0.0 {
>         pool {
>                 failover peer "foo";
>                 range 10.12.10.0 10.12.112.0;
>                 range 10.12.112.12 10.12.255.254;
>                 deny dynamic bootp clients;
>         }
>         option routers 10.12.0.254;
>         option subnet-mask 255.255.0.0;
>         option broadcast-address 10.12.255.255;
> }
>
> subnet 10.10.10.0 netmask 255.255.255.240 {
> }
>
> IP helper (aka UDP helper) and DHCP relay are enabled on the router that
> connects the network of the primary server with the network of the
> secondary server. I can ping and ssh from one server to the other and back.
>
> When I start the dhcpd service on both servers they fail to balance their
> leases.
>
> Here is a sample of the logs from both servers:
>
> Primary Server
>
> Sep 19 10:31:11 primary dhcpd: failover peer foo: I move from recover to startup
> Sep 19 10:31:11 primary dhcpd: failover peer foo: I move from startup to recover
> Sep 19 10:31:11 primary dhcpd: Sent update request all message to foo
> Sep 19 10:31:20 primary dhcpd: peer foo: disconnected
> Sep 19 10:31:22 primary dhcpd: failover peer foo: peer moves from recover-done to recover-done
> Sep 19 10:31:22 primary dhcpd: failover peer foo: peer moves from recover-done to recover-done
> Sep 19 10:31:45 primary dhcpd: DHCPINFORM from 10.12.181.177 via eth1
> Sep 19 10:31:45 primary dhcpd: DHCPACK to 10.12.181.177 (00:17:42:c0:e3:ce) via eth1
> Sep 19 10:32:45 primary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c (PC1) via eth1: not responding (recovering)
> Sep 19 10:32:46 primary dhcpd: DHCPINFORM from 10.12.181.177 via eth1
> Sep 19 10:32:46 primary dhcpd: DHCPACK to 10.12.181.177 (00:17:42:c0:e3:ce) via eth1
> Sep 19 10:32:49 primary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c (PC1) via eth1: not responding (recovering)
> Sep 19 10:32:57 primary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c (PC1) via eth1: not responding (recovering)
> Sep 19 10:33:13 primary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 (PC2) via eth1: not responding (recovering)
> Sep 19 10:33:13 primary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c (PC1) via eth1: not responding (recovering)
> Sep 19 10:33:17 primary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 (PC2) via eth1: not responding (recovering)
> Sep 19 10:33:25 primary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 (PC2) via eth1: not responding (recovering)
> Sep 19 10:33:41 primary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 (PC2) via eth1: not responding (recovering)
>
> Secondary Server
>
> Sep 19 10:31:11 secondary dhcpd: Update request all from foo: sending update
> Sep 19 10:31:23 secondary dhcpd: Wrote 22 leases to leases file.
> Sep 19 10:31:23 secondary dhcpd: failover peer foo: I move from recover-done to startup
> Sep 19 10:31:23 secondary dhcpd: failover peer foo: I move from startup to recover-done
> Sep 19 10:31:45 secondary dhcpd: DHCPINFORM from 10.12.181.177 via 10.12.0.1
> Sep 19 10:31:45 secondary dhcpd: DHCPACK to 10.12.181.177 (00:17:42:c0:e3:ce) via eth0
> Sep 19 10:32:45 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
> Sep 19 10:32:46 secondary dhcpd: DHCPINFORM from 10.12.181.177 via 10.12.0.1
> Sep 19 10:32:46 secondary dhcpd: DHCPACK to 10.12.181.177 (00:17:42:c0:e3:ce) via eth0
> Sep 19 10:32:49 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
> Sep 19 10:32:57 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
> Sep 19 10:33:13 secondary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 via 10.12.0.1: not responding (recover done)
> Sep 19 10:33:13 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
> Sep 19 10:33:17 secondary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 via 10.12.0.1: not responding (recover done)
> Sep 19 10:33:25 secondary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 via 10.12.0.1: not responding (recover done)
> Sep 19 10:33:41 secondary dhcpd: DHCPDISCOVER from 00:19:99:95:41:99 via 10.12.0.1: not responding (recover done)
> Sep 19 10:34:46 secondary dhcpd: DHCPDISCOVER from 00:1a:4b:45:3a:2f via 10.12.0.1: peer holds all free leases
> Sep 19 10:34:51 secondary dhcpd: DHCPDISCOVER from 00:1a:4b:45:3a:2f via 10.12.0.1: peer holds all free leases
> Sep 19 10:34:59 secondary dhcpd: DHCPDISCOVER from 00:1a:4b:45:3a:2f via 10.12.0.1: peer holds all free leases
> Sep 19 10:35:16 secondary dhcpd: DHCPDISCOVER from 00:1a:4b:45:3a:2f via 10.12.0.1: peer holds all free leases
> Sep 19 10:38:28 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
> Sep 19 10:38:32 secondary dhcpd: DHCPDISCOVER from 00:16:d3:e5:3a:3c via 10.12.0.1: not responding (recover done)
>
> I don't see any load-balance log lines, so I don't think lease
> balancing is happening...
>
> Sent update request all message to foo
> Update request all from foo: sending update
>
> The balancing process seems stuck on the two lines above.
>
> If I shut down the dhcpd daemon on one server, the peer doesn't seem to
> take over, even though it detects that the other peer is down.
>
> How can I fix this problem?
>
> Thank you in advance (and sorry for my bad English) :-)