Watching performance on a DHCP Server

Mon Feb 11 15:50:57 UTC 2008

Thanks Barr, it is always interesting to hear relevant practical
experiences. This is exactly the kind of problem I would like to
prepare/plan for. I've read that Microsoft defaults to an 8-day lease
time. ISC uses a default lease time of 10 minutes, with a max of 2 hours,
in the sample config included with 4.1.x.

We have successfully used 1-day leases in the past, though I know some
larger ISPs use 5-day, 7-day, or even longer lease times.

I'm assuming that the main advantage of a short lease time is that hosts
that join and leave a network give their leases up more rapidly (keeping
IP pool usage as low as possible). The main advantage of longer lease
times is reduced load on the DHCP server. If I have a relatively stable
network (only known MACs are allowed) then it seems like a longer lease
time (say 7-14 days) is more appropriate. And on a relatively stable
cable or DSL network, anything between 5 and 7 days seems acceptable?
Volatile networks (wifi hotspots?) would probably benefit from a lease
time of 1 hour or shorter.
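
As a rough sanity check on the load side of this trade-off, here is a
minimal Python sketch of the average steady-state renewal rate for a given
lease time. It assumes clients renew at T1, half the lease time (the
RFC 2131 default), that renewals are spread evenly over time, and that each
renewal costs the server one request. The function name is made up for
illustration, and the 12,000-client figure is simply borrowed from Barr's
description below.

```python
def renewals_per_second(clients, lease_seconds, t1_fraction=0.5):
    """Estimate the average steady-state renewal requests per second.

    Assumption: every client renews once per (lease_seconds * t1_fraction)
    and renewals are evenly spread, so this is an average, not a peak.
    """
    if clients < 0 or lease_seconds <= 0 or not 0 < t1_fraction <= 1:
        raise ValueError("need clients >= 0, lease_seconds > 0, 0 < t1_fraction <= 1")
    return clients / (lease_seconds * t1_fraction)


DAY = 86400  # seconds in one day

# Lease times mentioned in this thread, applied to a hypothetical
# population of 12,000 clients.
for label, lease in [
    ("10 minutes", 600),
    ("1 hour", 3600),
    ("1 day", DAY),
    ("7 days", 7 * DAY),
    ("14 days", 14 * DAY),
]:
    rate = renewals_per_second(12000, lease)
    print(f"{label:>10}: {rate:8.4f} renewals/s")
```

Under those assumptions, 12,000 clients on 10-minute leases average about
40 renewals per second, while the same clients on 7-day leases average
about 0.04 per second. This says nothing about bursts after an outage,
which is the case Barr describes.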

Does it sound like I am in the right ballpark with these figures?

-Blake

-------- Original Message  --------
Subject: Re: Watching performance on a DHCP Server
From: Barr Hibbs <rbhibbs at pacbell.net>
To: dhcp-users at isc.org
Date: Sunday, February 10, 2008 4:35:37 PM
> this experience is with a derivative of version 2 of the
> server, but as the basic functionality has not changed
> significantly for IPv4, it may be instructive....
>
> at the time, our environment had about 12,000 clients split
> roughly 55/45 between two servers...  each server was
> connected by two links to each of approximately 120 remote
> subnets, each link diversely routed to minimize disruption
> due to network problems, but also delivering 2 copies of
> every client message to the servers
>
> we suffered a massive regional power failure that lasted
> 2-1/2 days before complete restoration...  our clients
> received 7-day leases, largely grouped with their renewal
> times between 8 am and 6 pm, so in a 2-1/2 day outage, we
> could expect renewal requests to come from about half of our
> clients, and certainly init-reboot requests to come from
> all...  so, that is roughly 18,000 requests to be serviced
> as power is restored....
>
> of course, the power restoral didn't occur all at once, but
> was somewhat randomly distributed over a period of roughly
> 32 hours
>
> entirely by coincidence, we had instrumented the server to
> capture detailed message arrival rates and response times,
> expecting a normal, boring weekend...  but then the power
> failed, and...  we got lots more data than we expected!
>
> the real-time clock on our computers was capable of only 1
> millisecond resolution, so I must extrapolate....  our
> servers survived a nearly CONTINUOUS load of more than 1,000
> requests per second for 32 hours...
>
> of course, your mileage may vary, but by choosing an
> appropriate lease lifetime, you will probably see similar or
> better performance.
>
> --Barr Hibbs
>
>
>   
>> -----Original Message-----
>> From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org]
>> On Behalf Of David W. Hankins
>> Sent: Friday, February 08, 2008 08:55
>> To: dhcp-users at isc.org
>> Subject: Re: Watching performance on a DHCP Server
>>
>>
>> On Thu, Feb 07, 2008 at 06:07:51PM -0600, Blake Hudson wrote:
>>> By default in my distribution the leases file is stored in
>>> /var/lib/dhcpd/dhcpd.leases. This happens to be on a RAID1 array with
>>> 15k scsi disks and iostat shows the array as being maxed out once it
>>> reaches ~ 300 I/O's per second. DHCP logging is done asynchronously to
>>> the same array (which normally experiences ~ 50 I/O ops). With CPU and
>>> memory barely breaking a sweat, this leads me to believe that the
>>> limitation is with the disks (lots of tiny writes).
>>>
>>> I could move the leases file to a different array, or to tmpfs, but
>>> before I do I just want to know if these results are typical and that I
>>> have interpreted the test data correctly and made the correct
>>> determination as to the bottleneck.
>>
>> those results are typical for that kind of hardware, and you have
>> interpreted the test data correctly: fsync() is the biggest
>> bottleneck.
>>
>> in 4.1.0a1, you will find a feature, however, which was provided to
>> us in a patch by Christof Chen.  it permits the server to queue
>> multiple ACKs behind a single fsync(); default 28 (576 byte DHCP
>> packets filling default socket buffer send sizes).  the burst of acks
>> are sent presently if the sockets go dry, and shortly will be backed
>> up with a sub-second timeout.
>>
>> it has some bugs we're working on, particularly with failover, but
>> we'll address those in alpha.
>>
>> you may find that it provides some form of multiplicative benefit to
>> your performance stats, since fsync() is the bottleneck, and now there
>> are 28 acks per fsync max.
>>
>> so if you are only pushing 50 requests/s currently, you may live
>> comfortably in a 250 request/s buffer for some months until the
>> 4.1.x code is stable?
>>
>>> Also, I would appreciate any anecdotal evidence with regards to how many
>>> requests are typical in a large network under normal (or abnormal)
>>> conditions. If 10,000 users all of a sudden came online, how many
>>> requests would they really generate per second?
>>
>> there have been a few folks who suffered mass power outages, i don't
>> know what search query to use, but you can find them on the old
>> dhcp-server mailing list.  they did not report problems, rather the
>> surprise at the lack of problem.
>>
>> --
>> Ash bugud-gul durbatuluk agh burzum-ishi krimpatul.
>> Why settle for the lesser evil?
>> https://secure.isc.org/store/t-shirt/
>> --
>> David W. Hankins	"If you don't do it right the first time,
>> Software Engineer		     you'll just have to do it again."
>> Internet Systems Consortium, Inc.		-- Jack T. Hankins
