Performance issue ( maybe )
Glenn Satchell
glenn.satchell at uniq.com.au
Tue Sep 7 13:52:33 UTC 2010
On 09/07/10 17:58, Bjarne Blichfeldt wrote:
> ok the plot thickens..
>
>> -----Original Message-----
>> From: Glenn Satchell
>> Sent: 6. september 2010 15:09
>> To: Users of ISC DHCP
>> Subject: Re: Performance issue ( maybe )
>>
>> Ok, so this looks like some sort of networking issue, perhaps NIC,
>> cables or switch port? Check you have up to date drivers for your NICs.
>>
>> Run ifconfig and look for any errors or collisions. Check the speed and
>> duplex settings for the NIC and ask the network guys to check the same
>> settings on the switch port. Check cables are well seated in the server,
>> and if you can, on the switch.
>>
>> Try an ftp of a largish (few 10s of megabytes) file between each of the
>> servers and a third one to see if one works well and the other has some
>> problems? This will help isolate the system and give you a nice test case.
>>
>> Good luck, but at least a problem has been found, now to fix it!
>
> Agreed on checking the network, but so far everything seems to be in order: spanning tree, half/full duplex,
> no errors on the interfaces, no one else having issues, and ftp from dhcp1 to dhcp2 runs close to 100Mb. No abnormal traffic.
>
> However, as Tom brought to my attention, there is an awful lot of pool balancing going on. So here is a thought:
> what if the pool balancing creates so much load on the dhcp service that the failover connection is lost? That would create a
> runaway situation.
>
> The situation last week seems to have escalated after a configuration change. We make configuration changes by
> 1. pushing a new config to the primary server, then restarting dhcpd.
> 2. pushing a new config to the secondary server, then restarting dhcpd.
>
> In both cases, we get communications-interrupted.

communications-interrupted is expected when you stop one of the servers.
You may need to wait for a few minutes after restarting the first dhcpd
before restarting the second one. This is to allow time for it to
synchronise the leases.

Also, I always restart the secondary first. The reason behind this is
that if you add a new subnet and restart the primary, the running
secondary will complain about an unknown subnet when the primary tries
to sync the leases. Doing the secondary first avoids this issue.

> My failover clause is:
>
> failover peer "ipc-dhcp1-ipc-dhcp2" {
> primary;
> address 10.11.90.73;
> port 647;
> peer address 10.11.90.74;
> peer port 647;
> max-response-delay 90;
> max-unacked-updates 20;
> mclt 1800;
> split 128;
> load balance max seconds 5;
> }
>
> That means the defaults apply for:
> min-balance 60;
> max-balance 3600;
>
> Tom Schmitt mentioned 2 hours; I assume that was the min-balance time.
>
> My initial thought is to increase our values to:
> min-balance 1800;
> max-balance 7200;
>

Those settings should at least confirm whether or not the frequent
balancing is causing a problem. The numbers seem reasonable. I guess
min-balance needs to be big enough to allow a full resync of the leases
to the other server.

If you see too many addresses being balanced after half an hour you
might need to make it a bit smaller. I guess it depends on your lease
length, but ISTR you said these were quite large.
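
For what it's worth, here is a rough sketch of how the primary's
failover clause might look with those two values added. Everything else
is copied from what you posted; only the last two statements are new,
and the numbers are just your proposed starting point, not a
recommendation:

```
failover peer "ipc-dhcp1-ipc-dhcp2" {
  primary;
  address 10.11.90.73;
  port 647;
  peer address 10.11.90.74;
  peer port 647;
  max-response-delay 90;
  max-unacked-updates 20;
  mclt 1800;
  split 128;
  load balance max seconds 5;
  # Proposed test values (the defaults are 60 and 3600 seconds).
  min-balance 1800;
  max-balance 7200;
}
```

I am assuming you would want matching min-balance and max-balance
values in the secondary's failover clause as well, so that both servers
rebalance on the same schedule.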
--
regards,
-glenn
--
Glenn Satchell | Miss 9: What do you
Uniq Advances Pty Ltd, Sydney Australia | do at work Dad?
mailto:glenn.satchell at uniq.com.au | Miss 6: He just
http://www.uniq.com.au tel:0409-458-580 | types random stuff.