Question

Fri Jun 3 04:30:19 UTC 2022

I'm not at all sure your servers are running well, or that they're handling leases the way you think.

One probably trivial thing.
In the config you sent, you have a very odd split of zero.
That forces all of the load balancing to one side: toward the secondary, IIRC. Perhaps you're testing something I don't know about. I think 128 is the "normal" split. (I don't think there's any good reason not to balance them evenly; at least, I've never heard a use case that made sense.)
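
To make the effect of the split value concrete, here is a rough sketch in Python. It assumes the behavior described in the dhcpd.conf man page: each client hashes to a bucket from 0 to 255, and the primary answers for a bucket only if that bucket falls within the first "split" buckets. The function names are mine, purely for illustration, and the client hash itself is not reproduced here.

```python
def responsible_server(bucket: int, split: int) -> str:
    """Return which peer answers a client whose hash lands in `bucket`.

    Assumption (from the dhcpd.conf man page): the first `split` buckets
    belong to the primary, the rest to the secondary.
    """
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be in 0..255")
    # The 0..256 bound is this sketch's own sanity check, not a claim
    # about what dhcpd itself accepts.
    if not 0 <= split <= 256:
        raise ValueError("split must be in 0..256")
    return "primary" if bucket < split else "secondary"


def share(split: int) -> dict:
    """Count how many of the 256 buckets each peer is responsible for."""
    counts = {"primary": 0, "secondary": 0}
    for bucket in range(256):
        counts[responsible_server(bucket, split)] += 1
    return counts


if __name__ == "__main__":
    for split in (0, 128):
        print(split, share(split))
```

With a split of 0 no bucket ever belongs to the primary, which is why everything lands on the secondary; 128 divides the buckets evenly.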

More to the point, there are things that sure seem like symptoms of your peers not communicating properly.

In the logs:
Do you see the two peers go to "normal" when you start them both up, and to "communications-interrupted" when one is down?
Something like:
failover peer dhcp-failover: peer moves from normal to communications-interrupted 
failover peer dhcp-failover: I move from startup to normal 
failover peer dhcp-failover: peer moves from communications-interrupted to normal 
failover peer dhcp-failover: Both servers normal 
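
If eyeballing the log gets tedious, a short script can pull out just the state transitions. This is only a sketch in Python: the regex is modeled on the four sample lines above and nothing else, so adjust it to whatever your dhcpd actually logs. The function name is made up for illustration.

```python
import re

# Pattern modeled on the sample lines above (an assumption, not an
# exhaustive description of dhcpd's failover messages).
TRANSITION = re.compile(
    r"failover peer (?P<peer>[^:\s]+): (?P<who>I|peer) moves? "
    r"from (?P<old>\S+) to (?P<new>\S+)"
)


def failover_transitions(lines):
    """Return (peer_name, who, old_state, new_state) for each transition line."""
    events = []
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            events.append(
                (match["peer"], match["who"], match["old"], match["new"])
            )
    return events


if __name__ == "__main__":
    sample = [
        "failover peer dhcp-failover: peer moves from normal to communications-interrupted ",
        "failover peer dhcp-failover: I move from startup to normal ",
        "failover peer dhcp-failover: peer moves from communications-interrupted to normal ",
        "failover peer dhcp-failover: Both servers normal ",
    ]
    for event in failover_transitions(sample):
        print(event)
```

Feed it the lines of whichever file your syslog writes dhcpd messages to, on each server. If one side never reports "normal", that is a strong hint the peers are not talking.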

Are you seeing balance messages every hour as the two re-balance the available lease pool?

You say they are both handling leases properly, but how do you know this? (That a machine gets a lease from somewhere is not good evidence.)

A packet capture in front of the secondary might be helpful to see what traffic is passing - both to the peer and to clients.

(I hate making captures at least as much as the next person, but dang if they don't nearly always show something different from what I assumed. So I've gotten a lot less averse to getting captures. Yeah, they'll probably take me extra time to set up, get, and paw through [all when I could be fixin' stuff!], but they can save hours or days of fruitless searching for a fix when I don't even really *know* what's wrong yet. Don't know about anyone else, but fixing problems gets a whole lot easier when I actually know what's wrong, or at least have a good idea what's going on. :)

-Greg

>     I don't think so.  The secondary server seems to have gone completely silent, now, but I am getting a ton of them on the primary server, now.

> On 6/2/2022 8:06 PM, Richard L. Hamilton wrote:
>> https://serverfault.com/questions/313008/isc-dhcp-fails-to-sync-leases-between-peers

>> Probably the two DHCP servers weren’t able to talk to each other, not surprising when one was having problems.

>>> On Jun 2, 2022, at 7:42 PM, Leslie Rhorer <lesrhorer at siliconventures.net> wrote: 

>>>     During troubleshooting of my recent issue, I got tons of duplicates of errors like the following.  They seem to have stopped, now, but I am curious what they meant.

>>> Jun  1 00:31:56 Backup dhcpd[15785]: DHCPDISCOVER from 60:01:94:f0:41:48 via enp11s0: not responding (recovering)