extremely slow DHCPD startup time

Simon dhcp1 at thehobsons.co.uk
Mon Aug 8 11:13:32 UTC 2022


Sten Carlsen <stenc at s-carlsen.dk> wrote:

> AFAIK the server will fill its internal tables with every possible lease, most of them empty or marked free. I could be wrong about this.

Correct. It builds hash tables for every address defined in a pool, whether it’s used or not.

> So the question is how large your net blocks are?

...

Shane Merritt <smerritt at ua.edu> wrote:

> The total max available lease space according to dhcpd-pools is 283,751.  I did a restart on one of the pair earlier and the time was about 10 minutes, but activity is low at the moment since fall semester hasn’t begun.

Over 280k addresses, that’s “quite large”.
As above, that’s going to build a very large hash table in memory.
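
To get a feel for how many entries the server has to build at startup, you can add up the size of every range in the config. This is only a rough sketch: the ranges below are made-up examples (not from Shane's config) and count_range is a hypothetical helper name.

```python
import ipaddress

def count_range(first: str, last: str) -> int:
    """Number of addresses in an inclusive range, like 'range first last;' in dhcpd.conf."""
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    if hi < lo:
        raise ValueError(f"range end {last} is below range start {first}")
    return hi - lo + 1

# Hypothetical ranges, purely for illustration.
ranges = [
    ("10.0.0.10", "10.0.3.250"),
    ("10.1.0.1", "10.1.255.254"),
]

# Every address counted here gets an entry in the server's tables, used or not.
total = sum(count_range(first, last) for first, last in ranges)
print(f"{total} addresses defined across {len(ranges)} ranges")
```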

Now, I could be wrong, and hopefully someone who knows the internals will chime in on this ... I vaguely recall some discussion a long time ago about there being some tunable parameters (compile time) that can offer benefits when very large hash tables are involved. By default, these are optimised for much more modest networks, so it may be worth looking for these.


Of course, the other thing to consider is whether you have significant over-provisioning of address space. It could be tempting to say “there’s potentially 10k people/devices, and they might all want to connect to the same network segment at the same time, therefore every network segment must support 10k devices”. In reality, I would suspect that other than sizeable venues (sports/arts arenas) it’s unlikely that you’d get all your users in the same place at the same time. I guess if you are using dhcpd-pools then you’ll probably have some historical data to guide such an assessment.


Lastly, a bit about the leases file.
In accordance with the RFCs, every address defined in the config retains the history of its last usage “for ever”. So a device that appears once and never again will remain in the leases file until the address is finally re-used due to churn. This is to support the address stability requirement in the RFCs - basically if the device comes back and the address hasn’t been re-used yet, it will get its old address. That’s why you will see all the entries marked “binding state free”.

When it comes to address allocation, the order is (all subject to any filtering/allocation policies defined by groups and “allow members of ...” or “deny members of ...” rules):
* Virgin addresses that have never been previously used. Due to the way the hash tables work, this is done in an implementation-specific (undocumented and not guaranteed to remain the same) “top down” order, i.e. numerically highest address first.
* Previously used addresses (i.e. “binding state free”) in least recently used order.

These two will cover normal operation.
If the server runs out of these two, then it will start to recover abandoned leases (leases marked abandoned via the “ping before offer” method).
And if that fails too, there’s nothing available and the server will report “no free leases”.
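
The order described above can be sketched roughly as follows. This is not the server's actual code, just an illustration of the selection order: the Lease class, the state labels and pick_lease are all made up for the example, it ignores any allow/deny policies, and the order in which abandoned leases are picked is my assumption.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Lease:
    address: str
    state: str              # "never-used", "free", "active" or "abandoned" (illustrative labels)
    last_used: float = 0.0  # timestamp of last use, only meaningful once used

def pick_lease(pool: List[Lease]) -> Optional[Lease]:
    """Pick an address for a new client in the order described above."""
    # 1. Never-used addresses, numerically highest first.
    virgin = [l for l in pool if l.state == "never-used"]
    if virgin:
        return max(virgin, key=lambda l: int(ipaddress.ip_address(l.address)))
    # 2. Previously used but free addresses, least recently used first.
    free = [l for l in pool if l.state == "free"]
    if free:
        return min(free, key=lambda l: l.last_used)
    # 3. Abandoned leases as a last resort (taking the first one is an assumption).
    abandoned = [l for l in pool if l.state == "abandoned"]
    if abandoned:
        return abandoned[0]
    # 4. Nothing left: the caller would log "no free leases".
    return None

pool = [
    Lease("10.0.0.5", "never-used"),
    Lease("10.0.0.9", "never-used"),
    Lease("10.0.0.7", "free", last_used=100.0),
    Lease("10.0.0.8", "free", last_used=50.0),
    Lease("10.0.0.6", "abandoned"),
]
choice = pick_lease(pool)
print(choice.address if choice else "no free leases")
```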


The leases file itself is an “append only” file which will be added to every time there is a transaction - so you’ll probably have seen that at busy times it will grow quite quickly. It is also write-only, in that the server never reads it other than during startup. As you’ll also probably have observed, once an hour (again, a compile-time option), the server will write out a fresh copy from its in-memory tables (only one entry per lease) and this reduces the on-disk file to its minimum size.
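
To illustrate why the rewrite shrinks the file: when there are several entries for the same address, only the last one matters. Here is a minimal sketch of that idea. It is not how the server does it (the server writes from its in-memory tables, it doesn't re-parse the file), the sample entries are heavily simplified, and the regex assumes there are no nested braces inside a lease entry.

```python
import re

# Matches one simplified "lease <address> { ... }" entry.
LEASE_RE = re.compile(r"lease\s+(\S+)\s*\{.*?\}", re.DOTALL)

def compact(text: str) -> str:
    """Keep only the last entry seen for each address."""
    latest = {}
    for match in LEASE_RE.finditer(text):
        # A later entry for the same address replaces the earlier one.
        latest[match.group(1)] = match.group(0)
    return "".join(block + "\n" for block in latest.values())

# Simplified sample: 10.0.0.5 appears twice, as the file is append-only.
sample = """lease 10.0.0.5 {
  binding state active;
}
lease 10.0.0.6 {
  binding state active;
}
lease 10.0.0.5 {
  binding state free;
}
"""

compacted = compact(sample)
print(compacted)
```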


Simon


