Problems with pool balancing.

Glenn Satchell glenn.satchell at uniq.com.au
Tue May 7 12:35:13 UTC 2013


Why don't you just get the routers to forward directly to the DHCP servers?
Just set two ip helper entries on each router, one per server.

The bad checksums could be nothing to worry about if the NIC driver offloads
checksum calculation to the network card.
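
For reference, a minimal sketch of the standard Internet checksum (RFC 1071)
that UDP uses, to show what "bad checksum" means here. With checksum offload,
software that looks at a packet before the card has filled in the checksum
field would see this verification fail, even though the packet is fine on the
wire. The function names inet_checksum and checksum_ok are made up for
illustration and are not taken from dhcrelay. A real UDP check also has to
include the IP pseudo-header in the buffer, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones-complement checksum over a byte buffer (RFC 1071).
 * Hypothetical helper, not dhcrelay code. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add the buffer as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);

    /* An odd trailing byte is padded with a zero byte. */
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/* A buffer that already contains its checksum field verifies
 * when the checksum over the whole buffer comes out as zero. */
static int checksum_ok(const uint8_t *data, size_t len)
{
    return inet_checksum(data, len) == 0;
}
```

For the byte sequence 00 01 f2 03 f4 f5 f6 f7 (the worked example in RFC 1071)
inet_checksum gives 0x220d. Appending those two bytes makes checksum_ok return
true, while a checksum field that was never filled in makes it return false.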

regards,
-glenn

On Tue, May 7, 2013 10:19 pm, Erling Paulsen wrote:
> Update:
>
> Can restarting our servers, typically 8-10 times a day, cause
> balancing problems? The restarts are triggered by auto-updates from our
> network-admin-tool. We do, however, restart the
> secondary first and wait for it to complete before restarting the
> primary.
>
> But I have come to suspect that our isc-dhcrelay (4.2.5-P1 on FreeBSD
> 8.3-RELEASE-p7) service is to blame. It receives forwarded packets from
> the relay agents (ip-helpers on vlans) of about 100 Cisco routers and then
> forwards them on to all parties that need a say in setting up client
> parameters, including our primary and secondary dhcp-servers.
>
> For example, when counting DISCOVER packets for a busy day, I see that the
> secondary server is about 13K packets short of the primary, and that does
> not sound right to me.
>
> The dhcrelay (when running in debug mode) also complains about bad udp
> checksums. This is worrying. I have no idea what causes them or why.
>
> Also, it probably doesn't help that it's running on a vmware host in a
> pretty busy cluster!
>
> - Erling
>
>
> On 05/06/2013 01:44 PM, Erling Paulsen wrote:
>> Hello,
>>
>> We have two servers in failover relationship, both running
>> isc-dhcp42-server-4.2.4_2 on FreeBSD 8.3-RELEASE-p7 and we are having
>> problems with balancing the pools when pool-usage climbs into the
>> higher figures.
>>
>> Example (numbers from logfiles at about the same timestamp):
>>
>> Master: total 3905  free 1  backup 512  lts -255
>> "lts" is correct according to the documentation:
>> (free - backup)/2 = (1 - 512)/2 = -255 (integer division)
>> Since "leases to share" is negative, the master expects the secondary
>> server to hand over leases!
>>
>> Secondary: total 3905  free 513  backup 0  lts -256
>> "lts" is correct according to the source code below, where the
>> secondary computes (backup - free)/2 = (0 - 513)/2 = -256
>>
>> They do seem to have the same understanding of the current lease
>> situation. But! "lts" on secondary is also negative, so it's also
>> expecting the master to hand over leases!
>>
>> This cannot possibly end well?
>>
>> This is what's in the source-code and it seems to comply with the
>> description:
>>
>>                 if (p->failover_peer->i_am == primary) {
>>                         lts = (p->free_leases - p->backup_leases) / 2;
>>                         peer_lease_state = FTS_BACKUP;
>>                         /* my_lease_state = FTS_FREE; */
>>                         lq = &p->free;
>>                 } else {
>>                         lts = (p->backup_leases - p->free_leases) / 2;
>>                         peer_lease_state = FTS_FREE;
>>                         /* my_lease_state = FTS_BACKUP; */
>>                         lq = &p->backup;
>>                 }
>>
>> I don't understand what can be the cause of the double-trouble
>> negative lts on both sides! And, btw, can someone shed some light on how
>> the 'lq' pointer affects the balancing?
>>
>> Does anyone have thoughts on what might be the culprit here?
>> Any information would be appreciated.
>>
>>
>> - Erling Paulsen
>>
>
>
> --
> ---------------------------------|sent-av|-----
> Erling Paulsen, Seksjon for Infrastruktur/Nett
> TEO 2.402, Universitetet i Tromsø, 9037 TROMSØ  .
> Kontor (+47) 77 64 64 80 Mob (+47) 91 17 64 01 ..:
>
> _______________________________________________
> dhcp-users mailing list
> dhcp-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/dhcp-users
>
>
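
The lts figures quoted above can be reproduced with a small sketch of the
quoted branch. The struct pool_counts and the helper names are hypothetical
stand-ins, not the actual ISC DHCP types; the only things taken from the
thread are the two formulas and the logged counters. C integer division
truncates toward zero, which is why -511/2 comes out as -255 and -513/2 as
-256.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-pool counters in the quoted snippet. */
struct pool_counts {
    int free_leases;
    int backup_leases;
};

/* Mirrors the quoted branch: the primary compares FREE against BACKUP,
 * the secondary compares BACKUP against FREE. */
static int leases_to_share(const struct pool_counts *p, int i_am_primary)
{
    if (i_am_primary)
        return (p->free_leases - p->backup_leases) / 2;
    return (p->backup_leases - p->free_leases) / 2;
}

/* Plug in the counters from the two log lines in the thread. */
static void check_logged_values(void)
{
    struct pool_counts master = { 1, 512 };    /* free 1, backup 512 */
    struct pool_counts secondary = { 513, 0 }; /* free 513, backup 0 */

    assert(leases_to_share(&master, 1) == -255);
    assert(leases_to_share(&secondary, 0) == -256);
}
```

Both asserts hold, so each log line is consistent with the formula for its own
role. What differs is the input: the master counts 1 free and 512 backup
leases, while the secondary counts 513 free and 0 backup, so the two servers
do not seem to agree on which leases are in which state.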



