Max number of subnets DHCP can handle

Mon Jan 28 20:32:47 UTC 2008

On 1/28/08 10:23 AM, David W. Hankins wrote:

> 
> that's a strange situation to be in.  it sounds like the secondary
> thinks the pools are "panic unbalanced", but the primary does not
> think it needs to transfer any leases.
> 
> i'd make sure the config files are consistent (including "failover
> peer" lines in the pools), and then compare the post-failover lease
> databases to see what's going wrong.  if anything were wrong, i would
> probably wind up "faulting" one of the lease databases so that it is
> reloaded off the peer.  since this sounds like a test environment,
> maybe you could just cut to that chase and try that.  just shutdown
> one daemon, delete dhcpd.leases and touch a new file, then restart the
> daemon; it will load the db off the peer, enter a RECOVER-WAIT state
> for MCLT, then resume normal operations (this isn't a bad test to
> perform either).
> 
> 3.0.1 also has several known failover flaws.  run 3.0.6 at least if
> you want to evaluate 3.0.x.
> 
> this sort of "pool rebalance cpu starvation" due to the peer sending
> never-ending POOLREQ messages has been addressed in our CVS repository
> for the next release; we now implement a hold-down timer on POOLREQs.
> 
> from 3.1.next's RELNOTES;
> 
> - A bug in failover pool rebalancing that caused POOLREQ message ping-pongs
>   was repaired.
> 
> - A flaw in failover pool rebalancing that could cause POOLREQ messages to
>   be sent outside of the min-balance/max-balance scheduled intervals has
>   been repaired.
> 
> but it's hard to say if the above would resolve your secondary's
> condition that's causing it to send the POOLREQs in the first place;
> e.g. any lease db inconsistency.
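
The "faulting" procedure David describes (stop one daemon, replace dhcpd.leases with an empty file, restart) could be scripted roughly as below. This is only a minimal sketch: `fault_lease_db`, `stop_daemon` and `start_daemon` are hypothetical names, the stop/start hooks are left to the caller because the way dhcpd is managed differs per system, and the old file is moved aside rather than deleted so the two lease databases can still be compared afterwards.

```python
import shutil
import time
from pathlib import Path


def fault_lease_db(lease_path, stop_daemon, start_daemon):
    """Replace the lease file with an empty one while the daemon is stopped.

    stop_daemon and start_daemon are caller-supplied callables (hypothetical
    hooks); how dhcpd is actually stopped and started depends on the system.
    """
    lease_file = Path(lease_path)
    stop_daemon()
    backup = None
    if lease_file.exists():
        # Keep the old database for later comparison instead of deleting it.
        backup = lease_file.with_name(
            lease_file.name + ".faulted-" + time.strftime("%Y%m%d%H%M%S"))
        shutil.move(str(lease_file), str(backup))
    lease_file.touch()  # the "touch a new file" step
    start_daemon()
    return backup
```

On restart the daemon would then be expected to reload its leases from the peer and sit in RECOVER-WAIT for MCLT, as described above.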

I've managed to pare down the list of subnets to 1500.  Some monkeying
around shows that, at least with 3.0.1, it eventually loads it and goes
to "normal" status on both servers.

I'd like to use 3.1.0, as it's supposed to do better lease affinity.  I
went with 3.0.1 because it was easy with RH and we already have a
production system running 3.0.3, so it was a good quick and dirty way to
see the differences.

I'm in the process of starting up 3.1.0 with my new config file.  It was 
script generated so the two are identical, so I don't think we're
having config issues.  However, it's spent about 45 minutes doing
something to the leases file, generating what I'll describe below.  At 
this point, it looks like it's just a lot to generate, so it's taking a 
while, and once this initial process is completed, subsequent loads will 
be very short.  But in this case, the last log line reported is "update 
request from dhcp_failover: sending update" on both sides, and there's 
been no failover timeout.

> 
> pre whatsitwhosits?  in normal operation, "virgin" (never before used)
> leases are not written to the lease db, to save time and space.  in
> failover (of any version), some of these will change state to 'backup'
> (to be allocated by the secondary), and so will be recorded in the
> lease db.
> 
> this is normal in any version.

I'm a bit confused; I may have just been stating this poorly.  Both
3.0.1 and 3.1.0 are generating some set of leases in the leases files,
and in both cases I have done this with a virgin dhcpd.leases file (in
both cases I'm using the smaller of my possible configs).  In v3.0.1,
it has records for every possible lease, as such:

lease 172.22.255.133 {
   starts 6 2008/01/26 23:07:25;
   tstp 6 2008/01/26 23:07:25;
   tsfp 6 2008/01/26 23:07:25;
   binding state backup;
}
lease 172.22.255.134 {
   starts 6 2008/01/26 23:07:25;
   tstp 6 2008/01/26 23:07:25;
   tsfp 6 2008/01/26 23:07:25;
   binding state backup;
}

And similarly on the backup - it looks like the same setup of IPs.  The
ranges are 11-254, so I'm guessing this is the back half, 133-254.
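
The "back half" guess can be checked with a little arithmetic. The sketch below assumes the pool is split evenly in two; whether the server divides a fresh pool exactly this way is an assumption, not something the lease file confirms.

```python
first, last = 11, 254                 # range endpoints mentioned above
total = last - first + 1              # 244 addresses in the range
backup_first = last - total // 2 + 1  # first address of the upper half
print(total, backup_first)            # 244 133
```

Under that assumption the upper half does start at .133, which matches the first lease shown.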

On 3.1.0, as mentioned it's 45+ minutes into startup, with no log 
messages, so I'm guessing it's just finishing this up, but they look 
like this:

lease 172.31.99.254 {
   binding state free;
}
lease 172.31.99.253 {
   binding state free;
}
lease 172.31.99.252 {
   binding state free;
}

I think that I just misstated what it's doing, David, as your explanation
sounds like the above, but I want to make sure I have my head around
this correctly.
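
To compare what the two versions write, something like the following could tally leases by binding state. It is a rough sketch, not a full dhcpd.leases parser: `count_binding_states` is a hypothetical helper, the regexes assume simple entries like the ones pasted above (no nested braces), and the sample text is trimmed from those excerpts.

```python
import re
from collections import Counter

LEASE_RE = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.DOTALL)
STATE_RE = re.compile(r"^\s*binding state\s+(\w+);", re.MULTILINE)


def count_binding_states(text):
    """Count leases by their last recorded binding state."""
    latest = {}
    for address, body in LEASE_RE.findall(text):
        match = STATE_RE.search(body)
        if match:
            # Later entries for the same address override earlier ones.
            latest[address] = match.group(1)
    return Counter(latest.values())


sample = """
lease 172.22.255.133 {
   starts 6 2008/01/26 23:07:25;
   binding state backup;
}
lease 172.31.99.254 {
   binding state free;
}
lease 172.31.99.253 {
   binding state free;
}
"""
print(count_binding_states(sample))
```

Run against the real file (read with `open(...).read()`), this would show at a glance whether a server has recorded mostly "backup" or mostly "free" entries.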

> essentially;
> 
> leases are ordered in memory on a hash table which (as of 3.1.0) is
> not nearly so pessimal as 3.0.x.  several little missteps were fixed.
> i think 4.0.x is supposed to be faster still, but i can't remember
> why.
> 
> shared networks are ordered in memory on a linear list, and subnets
> are attached on a linear list beneath them.  it is possible to find a
> subnet from a lease, but for DISCOVER or SELECTING/INIT-REBOOT REQUEST
> processing, these lists may be traversed in order to calculate the
> client's point of network attachment (seeking a subnet that contains
> the giaddr, no seek is necessary if giaddr is not set since the daemon
> uses one socket per interface, and can just record what socket the
> packet came in on).
> 
> 
> so whereas lease lookup performance is (hopefully) O(1), although
> realistically O(leases) on 3.0.x and sufficiently large numbers of
> leases (it starts around 16k I think, from memory), subnet lookup
> performance is O(subnets) always.
> 
> so for each lease you add, it's a no-op.  for each subnet you add,
> you're using a teensy bit more CPU for this processing.
> 
> the question is whether or not you have enough cpu for the subnets you
> want.
> 
> short of rewriting the sources (to do a quicker subnet lookup), there
> is no workaround.
> 
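
The difference David describes between lease lookup and subnet lookup can be illustrated with a toy model. This is only a sketch of the idea, not ISC dhcpd's actual data structures: the subnets, the `find_subnet` helper and the flat list (ignoring the shared-network level) are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical subnets, kept on a plain list as in the description above.
subnets = [ipaddress.ip_network(f"172.22.{i}.0/24") for i in range(256)]

# Leases keyed by address: one hash lookup regardless of how many there are.
leases = {ipaddress.ip_address("172.22.255.133"): "backup"}


def find_subnet(giaddr, subnet_list):
    """Walk the list until a subnet contains giaddr: O(subnets) per packet."""
    address = ipaddress.ip_address(giaddr)
    for steps, net in enumerate(subnet_list, start=1):
        if address in net:
            return net, steps
    return None, len(subnet_list)


net, steps = find_subnet("172.22.255.1", subnets)
print(net, steps)
```

A giaddr that falls in the last subnet on the list costs as many comparisons as there are subnets, which is why adding subnets adds a little CPU per packet while adding leases does not.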

Thanks for the description.  We're going to be provisioning new
hardware, but it sounds like, at least until some enhancements land in
3.1.x, that may still not solve the pool rebalance CPU starvation issue,
although it's probably hard to say.

I'm only shying away from 4.0.0 because of the criticality of this
server (it's answering VOIP handset DHCP requests), and I am wary of
vX.0.0 of anything, no offense intended.  I feel more comfortable
with 3.1.0, and there is a specific feature it has that's desirable,
related again to these VOIP handsets, which don't tolerate changes in
IP addresses well.  The reason for the large number of subnets is to
avoid needing to make config changes, so DHCP server restarts should be
minimal, and hopefully at no other times than server downtime.

David, thanks for your reply, this is all very helpful.  I was hoping
that v3.1.0 would wrap up while I was writing this, but I'll have to
follow up with an update.  Once it's loaded and has its leases file
stabilized, I'll try doing a restart to see what timing looks like on
that, and then I'll try a virgin leases file on the secondary and see
how it does with that, although right now that doesn't strictly look
necessary.

chris