Problems with pool balancing.

Randall C Grimshaw rgrimsha at syr.edu
Tue May 7 12:32:16 UTC 2013


We have also seen the wireless controllers show a preference for one peer over the other, but the servers have always balanced pretty well in the end. Our automated configuration update restarts one server at a time to avoid any appearance of an outage to the clients. Having both servers down at the same time is not recommended, but if you do, I would start them both together to avoid peering 'issues'. I am completely lost in your chain-of-relays description and, as I interpret it, cannot imagine how it would even function: one helper should be sufficient to translate a layer 2 protocol to layer 3.

Randall Grimshaw rgrimsha at syr.edu
________________________________________
From: dhcp-users-bounces+rgrimsha=syr.edu at lists.isc.org [dhcp-users-bounces+rgrimsha=syr.edu at lists.isc.org] on behalf of Erling Paulsen [erling.paulsen at uit.no]
Sent: Tuesday, May 07, 2013 8:19 AM
To: Users of ISC DHCP
Subject: Re: Problems with pool balancing.

Update:

Can restarting our servers, typically 8-10 times a day, cause
balancing problems? The restarts are triggered by auto-updates from our
network-admin-tool. We do, however, restart the secondary first and
wait for it to complete before restarting the primary.

But I have come to suspect that our isc-dhcrelay (4.2.5-P1 on FreeBSD
8.3-RELEASE-p7) service is to blame. It receives forwarded packets from
the relay agents (ip-helpers on VLANs) of about 100 Cisco routers and
then forwards them again to all parties that need a say in setting up
client parameters, including our primary and secondary dhcp-servers.

For example, when counting DISCOVER packets for a busy day, I see that
the secondary server is about 13K packets short of the primary, and that
does not sound right to me.

The dhcrelay (when running in debug mode) also complains about bad UDP
checksums. This is worrying, and I have no idea what that is about or
why it happens.

Also, it probably doesn't help that it's running on a VMware host in a
pretty busy cluster!

- Erling


On 05/06/2013 01:44 PM, Erling Paulsen wrote:
> Hello,
>
> We have two servers in failover relationship, both running
> isc-dhcp42-server-4.2.4_2 on FreeBSD 8.3-RELEASE-p7 and we are having
> problems with balancing the pools when pool-usage climbs into the
> higher figures.
>
> Example (numbers from logfiles at about the same timestamp):
>
> Master: total 3905  free 1  backup 512  lts -255
> "lts" is correct according to the documentation at (free -
> backup)/2 = -255
> Since "leases to share" is negative, the master expects the secondary
> server to hand over leases!
>
> Secondary: total 3905  free 513  backup 0  lts -256
> "lts" is correct according to the documentation at (free -
> backup)/2 = -256
>
> They do seem to have the same understanding of the current lease
> situation. But "lts" on the secondary is also negative, so it is also
> expecting the master to hand over leases!
>
> This cannot possibly end well?
>
> This is what's in the source-code and it seems to comply with the
> description:
>
>                 if (p->failover_peer->i_am == primary) {
>                         lts = (p->free_leases - p->backup_leases) / 2;
>                         peer_lease_state = FTS_BACKUP;
>                         /* my_lease_state = FTS_FREE; */
>                         lq = &p->free;
>                 } else {
>                         lts = (p->backup_leases - p->free_leases) / 2;
>                         peer_lease_state = FTS_FREE;
>                         /* my_lease_state = FTS_BACKUP; */
>                         lq = &p->backup;
>                 }
>
> I don't understand what can be the cause of the double-trouble
> negative lts on both sides! And, btw, can someone shed a light on how
> the 'lq' pointer affects the balancing?
>
> Anyone have thoughts of what might be the culprit here?
> Any information would be appreciated.
>
>
> - Erling Paulsen
>
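
As a worked check of the arithmetic quoted above, here is a minimal sketch in C that applies the two formulas from the source excerpt to the counters each server logged. The names failover_role, leases_to_share and check_reported_values are hypothetical and exist only for this illustration; only the two formulas come from the excerpt. Note that the secondary branch subtracts in the opposite order, (backup - free)/2, which is how 513 free and 0 backup yield -256. The sketch says nothing about what dhcpd subsequently does with the value.

```c
#include <assert.h>

/* Hypothetical names for illustration; only the two formulas below are
 * taken from the source excerpt quoted in the thread. */
enum failover_role { ROLE_PRIMARY, ROLE_SECONDARY };

/* "Leases to share" as computed in the quoted excerpt. C integer division
 * truncates toward zero, so -511 / 2 is -255 and -513 / 2 is -256. */
static int leases_to_share(enum failover_role role,
                           int free_leases, int backup_leases)
{
    if (role == ROLE_PRIMARY)
        return (free_leases - backup_leases) / 2;
    return (backup_leases - free_leases) / 2;
}

/* Re-derive the lts values from the counters reported in the log lines. */
static void check_reported_values(void)
{
    /* Master log line: free 1, backup 512, lts -255 */
    assert(leases_to_share(ROLE_PRIMARY, 1, 512) == -255);
    /* Secondary log line: free 513, backup 0, lts -256 */
    assert(leases_to_share(ROLE_SECONDARY, 513, 0) == -256);
}
```

Each formula measures the surplus of the server's own lease type over its peer's, using that server's own counters. The two log lines report different free/backup splits of the same 513 available leases (1/512 on the master, 513/0 on the secondary), which is why both values can come out negative at once.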


--
---------------------------------|sent-av|-----
Erling Paulsen, Seksjon for Infrastruktur/Nett
TEO 2.402, Universitetet i Tromsø, 9037 TROMSØ  .
Kontor (+47) 77 64 64 80 Mob (+47) 91 17 64 01 ..: