tuning for maximum dhcp performance

Sat Apr 26 15:42:10 UTC 2008

That describes a cable modem network approximately twice our size in a
similarly rural area.  Our cable modem network is in about a dozen towns
served by a variety of electric utilities.  In the 7+ years of our network,
we've never had a single power outage affect more than two towns.  Unless
you have reason to believe otherwise, I think your experience may be
similar.

If you reboot the CMTS the CMs will go offline and have to DHCP once they
reacquire a signal, but the CPE behind them won't, so that's a saving grace.
The advantage you have with a CMTS is that there will be spread in CMs
getting back online that's not DHCP-bound but limited by the ability of the
CMs to range.  With our system I find it takes up to 20 minutes for every
possible CM to get back online (I wish I had a histogram showing CMs added
per minute -- I should poll the CMTS every 10 seconds next time I perform a
software upgrade).  Where I have fewer than 75 CMs per upstream port it's
only 5 to 10 minutes for them to get back online, but where there are more
than 150 it can take twice as long.
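
A minimal sketch of how that CMs-added-per-minute histogram could be built,
assuming the online-CM count has already been polled every 10 seconds and
stored as (seconds since reboot, count) pairs.  How the count is read from
the CMTS is left out on purpose, and the function names and sample numbers
are made up for illustration.

```python
def cms_added_per_minute(samples):
    """Turn polled (seconds_since_reboot, online_cm_count) pairs, sorted by
    time, into {minute_index: net CMs added during that minute}.

    Each change is credited to the minute in which the polling interval
    started, so a poll at t=60 still counts toward minute 0.
    """
    added = {}
    for (prev_t, prev_count), (_, count) in zip(samples, samples[1:]):
        minute = int(prev_t // 60)
        added[minute] = added.get(minute, 0) + (count - prev_count)
    return added


def render_histogram(hist, width=40):
    """Return a plain-text bar chart, one line per minute."""
    peak = max(hist.values(), default=0)
    lines = []
    for minute in sorted(hist):
        n = hist[minute]
        bar_len = n * width // peak if peak > 0 and n > 0 else 0
        lines.append(f"{minute:3d} min {n:6d} " + "#" * bar_len)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical poll results, not real measurements.
    samples = [(0, 0), (10, 5), (20, 12), (50, 40),
               (60, 55), (70, 80), (120, 100)]
    print(render_histogram(cms_added_per_minute(samples)))
```

The counts are net changes, so a minute in which more CMs dropped off than
ranged would show up as a negative number rather than being hidden.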

Frank

-----Original Message-----
From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org] On Behalf
Of Dan
Sent: Saturday, April 26, 2008 8:14 AM
To: dhcp-users at isc.org
Subject: RE: tuning for maximum dhcp performance

Sure, it's a cable modem network.  Each CMTS will have 5000-10000 modems
with at least one CPE behind each.  Any regional power outage or CMTS
reboot would cause a significant surge in requests.  It's a rural area, so
these things happen a little more frequently than you might assume.

On Fri, 25 Apr 2008, Frank Bulk - iNAME wrote:

> Would you be willing to describe your node environment that would
> facilitate them coming online at the "same time"?  What kind of
> maintenance window would cause 10,000 nodes to do that?
>
> Frank
>
> -----Original Message-----
> From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org] On
> Behalf Of Dan
> Sent: Friday, April 25, 2008 9:44 PM
> To: dhcp-users at isc.org
> Subject: RE: tuning for maximum dhcp performance
>
>
> I could easily lose up to almost 10,000 nodes at once and have them all
> come online at effectively the same time.  This could, and will I'm sure,
> occur during a maintenance window.
>
> I'd like to make the most of what I have.
>
>
> On Fri, 25 Apr 2008, Frank Bulk - iNAME wrote:
>
>> I serve up 10,000 leases ranging from 3 to 14 days.  I haven't spent a
>> second optimizing it.  It just works and has worked no matter what the
>> client outage conditions have been.
>>
>> Unless you're serving up a campus where there is a real possibility that
>> thousands of like clients (e.g. VoIP phones) may power up and come back
>> online, there's no need to spend time over-engineering.  If there were
>> 20k computers on a campus that lost power and power came back on
>> simultaneously, many of the PCs would stay off (configured in the BIOS),
>> and those configured to power on after power failure would reach the DHCP
>> request phase at different times.  At 80/second, it would take just a bit
>> over 4 minutes to serve them all (if the requests were linear).  Would it
>> really matter if in the worst of all cases it took 10 minutes for every
>> client to be back online?
>>
>> It's those networks that serve hundreds of thousands of clients that need
>> to spend time engineering a solution that serves up IPs in a timely fashion.
>>
>> Frank
>>
>> -----Original Message-----
>> From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org] On
>> Behalf Of Dan
>> Sent: Friday, April 25, 2008 1:01 PM
>> To: dhcp-users at isc.org
>> Subject: tuning for maximum dhcp performance
>>
>>
>> I'm currently constructing a replacement for an old Cisco Network
>> Registrar setup serving about 20,000 nodes (10,000 with 24-hour leases,
>> 10,000 with 7-day leases).
>>
>> I'm running Linux 2.6.22 using ISC DHCPd 3.0.5 with dhcp-3.0.5-ldap-patch
>> and dhcp-3.0.5-next-file.patch.  I hope to use failover between the 2
>> servers, but haven't worked on that yet.
>>
>> As stated time and again, the software will not be the bottleneck. Using
>> dhcperf's discovery benchmark, I'm seeing about 80 clients/second right
>> now with my new hardware (ping-check off).  When I disable the per-lease
>> fsync or move the dhcpd.leases file to ramdisk, it jumps to well over 400
>> clients/second limited by CPU.
>>
>> My hardware is 2 servers with the following spec:
>>   Dell PowerEdge 2970
>>   Dual-core 2GHz 64-bit AMD
>>   4GB RAM
>>   10k RAID1 System Drives
>>   15k RAID10 Storage Drives (just for dhcpd.leases file)
>>
>>
>> Does anyone have any pointers on running a system like this and achieving
>> maximum dhcp performance?
>>
>> Some factors that come to mind are:
>>   -Other patches I should/could be using?
>>   -Raid stripe element size, read-ahead, and write-back?
>>      (currently 64KB, no, and yes)
>>   -Filesystem choice for dhcpd.leases file?
>>      (ext3, reiserfs, xfs, jfs -- currently reiserfs)
>>   -Filesystem parameters to tune?
>>   -Kernel parameters to tune?
>>
>>
>> Having a better understanding about how DHCPd works with the dhcpd.leases
>> file might give me some of the answers to these questions also.
>>
>> Any information or shared experiences would be greatly appreciated.
>>
>> Thanks,
>>
>> Dan