Recovering from loss of leases.dhcpd on secondary of failover pair

Glenn Satchell Glenn.Satchell at uniq.com.au
Sat Jan 12 14:39:13 UTC 2008


>Date: Sun, 13 Jan 2008 01:23:17 +1100
>From: Nick Urbanik <nicku at nicku.org>
>To: dhcp-users at isc.org
>Subject: Re: Recovering from loss of leases.dhcpd on secondary of failover pair
>
>Dear Folks,
>
>On 13/01/08 00:39 +1100, Glenn Satchell wrote:
>>>From: Nick Urbanik <nicku at nicku.org>
>>>On 12/01/08 22:06 +1100, Nick Urbanik wrote:
>>>> This does not seem right.  I see none of the usual evidence of
>>>> rebalancing in the logs.  Can anyone suggest some way to trigger the
>>>> two machines to balance their leases?
>>>
>>>Do you think that I might cause any problems by restarting the
>>>secondary while it is in recover state, and the primary is in partner
>>>down state?
>>
>>My experience of shutting down the secondary, putting the primary
>>into partner down, and then starting the secondary is that the
>>primary immediately moves out of partner down mode. Now this was
>>without starting with an empty dhcpd.leases file, so it might behave
>>somewhat differently.
>>
>>Are you sure communication is working between the two dhcp servers?
>
>There is quite a bit of communication happening on the failover port;
>I count 3915 packets on that port measured (both directions) on the
>primary from 00:08:11.886681 to 00:19:40.265044, and 4092 packets in
>both directions measured on the failover port on the secondary from
>00:07:50.833351 to 00:19:48.767101.
>
>>What messages, if any, are there in the log files?
>
>The log file on the secondary acknowledges requests for private
>addresses for the cable modems, and for the most part, says things
>like this for the PC requests:
>dhcpd: DHCPREQUEST for 210.49.177.9 (211.31.132.45) from 00:0d:88:04:5f:b7 via 220.237.136.1: not responding (recovering)
>There are lots of "peer holds all free leases" in the secondary log.
>
>The number of PC leases in the secondary lease file is only:
>active: 1028, free: 954
>active: 4125, backup: 327, free: 608
>
>whereas on the primary, we have:
>
>UNKNWON: abandoned: 1, active: 13787, backup: 26867, expired: 1, free: 7472
>abandoned: 1, active: 3837, backup: 359, expired: 2, free: 608
>abandoned: 1, active: 3859, backup: 333, expired: 2, free: 612
>active: 4569, backup: 448, expired: 4, free: 798
>active: 4533, backup: 483, expired: 5, free: 798
>active: 5957, backup: 474, expired: 3, free: 903
>active: 2992, backup: 274, free: 529
>active: 3068, backup: 257, free: 470
>abandoned: 1, active: 2004, backup: 911, expired: 9, free: 1375, released: 1
>active: 2962, backup: 270, free: 563
>active: 2319, backup: 1689, expired: 6, free: 2309, released: 1
>active: 1718, backup: 802, expired: 1, free: 1021
>active: 877, backup: 361, free: 540
>active: 4318, backup: 500, free: 748
>abandoned: 2, active: 4740, backup: 499, expired: 4, free: 827
>abandoned: 1, active: 2853, backup: 1060, expired: 5, free: 1394
>backup: 465
>active: 3341, backup: 365, free: 595
>active: 5481, backup: 391, expired: 9, free: 697
>active: 3496, backup: 351, free: 706
>active: 3757, backup: 377, expired: 10, free: 663
>active: 3188, backup: 331, free: 529
>active: 3268, backup: 376, free: 657
>active: 1021, backup: 197, free: 300
>active: 2496, backup: 300, free: 493
>backup: 791
>active: 4945, backup: 353, expired: 5, free: 769
>abandoned: 2, active: 5176, backup: 541, expired: 5, free: 853, released: 1
>abandoned: 1, active: 5776, backup: 467, free: 839
>abandoned: 1, active: 5898, backup: 427, expired: 3, free: 754, released: 1
>active: 4891, backup: 515, expired: 4, free: 915
>active: 5020, backup: 418, free: 887
>active: 4064, backup: 327, free: 669
>
>>What version of dhcpd?
>
>3.0.4

OK, so the two servers do seem to be communicating, although slowly.
You might need to wait and see how long it takes to get those 900k
leases copied across. Maybe it will come out of recovery mode then.

regards,
-glenn

