dhcpd process hitting data size limit

sthaug at nethelp.no
Mon Mar 3 14:26:12 UTC 2008


I have a dhcpd 3.1.0 server running as primary in a failover configuration
with around 100k leases. Normal process size is around 90-100 MB. Today the
dhcpd process on this server ballooned to over 500 MB and then hit the
default data size limit of 512 MB. In the logs I found the following:

Mar  3 14:23:21 dhcp2 dhcpd: dhcp_failover_put_message: something went wrong.
Mar  3 14:23:21 dhcp2 dhcpd: peer dhcp1-dhcp2: disconnected
Mar  3 14:23:21 dhcp2 dhcpd: failover peer dhcp1-dhcp2: I move from normal to communications-interrupted
Mar  3 14:23:22 dhcp2 dhcpd: uid lease 193.71.113.38 for client 00:00:e2:94:6a:61 is duplicate on 193.71.112/21
Mar  3 14:23:23 dhcp2 dhcpd: uid lease 81.191.9.183 for client 00:08:da:53:b9:df is duplicate on 81.191.0/20
Mar  3 14:23:26 dhcp2 dhcpd: dhcp_failover_put_message: something went wrong.
Mar  3 14:23:26 dhcp2 dhcpd: peer dhcp1-dhcp2: disconnected
Mar  3 14:23:26 dhcp2 dhcpd: failover: connect: no matching state.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.
(repeat ad nauseam)

On the failover peer, where the dhcpd process stayed at its normal size,
I found the following:

Mar  3 14:23:21 slam2 dhcpd: peer dhcp1-dhcp2: disconnected
Mar  3 14:23:21 slam2 dhcpd: failover peer dhcp1-dhcp2: I move from normal to communications-interrupted
Mar  3 14:23:22 slam2 dhcpd: uid lease 195.0.206.75 for client 00:17:3f:96:d8:06 is duplicate on 195.0.200/21
Mar  3 14:23:24 slam2 dhcpd: uid lease 81.191.61.134 for client 00:0b:82:0d:06:0a is duplicate on 81.191.48/20
Mar  3 14:23:26 slam2 dhcpd: peer dhcp1-dhcp2: disconnected
Mar  3 14:23:28 slam2 dhcpd: uid lease 81.191.126.180 for client 00:a0:c5:c0:35:ea is duplicate on 81.191.112/20
Mar  3 14:23:31 slam2 dhcpd: uid lease 193.90.168.171 for client 00:a0:c5:db:5a:97 is duplicate on 193.90.160/20
Mar  3 14:23:35 slam2 dhcpd: uid lease 81.191.199.70 for client 00:a0:c5:80:84:37 is duplicate on 81.191.192/20
Mar  3 14:23:40 slam2 dhcpd: uid lease 193.91.143.135 for client 00:17:3f:5c:28:64 is duplicate on 193.91.128/20
Mar  3 14:23:41 slam2 dhcpd: failover: link startup timeout
Mar  3 14:23:42 slam2 dhcpd: uid lease 81.191.182.196 for client 00:13:49:4a:c3:b0 is duplicate on 81.191.176/20
Mar  3 14:23:44 slam2 dhcpd: uid lease 81.191.2.218 for client 00:a0:c5:56:a5:cc is duplicate on 81.191.0/20
Mar  3 14:23:46 slam2 dhcpd: failover: link startup timeout
Mar  3 14:23:46 slam2 dhcpd: failover: link startup timeout
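
To compare the two excerpts above side by side, the messages can be tallied
per host and per type. The following is only a minimal sketch: it assumes the
syslog line layout shown above (no PID after "dhcpd"), and the category names
and the classify() and tally() helpers are made up for illustration, not part
of dhcpd.

```python
import re
from collections import Counter

# Matches "Mon DD HH:MM:SS host dhcpd: message" as in the excerpts above.
# Assumption: no "[pid]" suffix after the program name.
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+(\S+)\s+dhcpd:\s+(.*)$")


def classify(message):
    """Map a dhcpd message to a coarse, illustrative category."""
    if "is duplicate on" in message:
        return "duplicate uid lease"
    if message.startswith("no memory"):
        return "out of memory"
    if "disconnected" in message:
        return "peer disconnected"
    if "link startup timeout" in message:
        return "link startup timeout"
    return "other"


def tally(lines):
    """Count messages per (host, category); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            host, message = match.groups()
            counts[(host, classify(message))] += 1
    return counts


# A few lines copied from the excerpts above.
sample = [
    "Mar  3 14:23:21 dhcp2 dhcpd: peer dhcp1-dhcp2: disconnected",
    "Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.",
    "Mar  3 14:23:26 dhcp2 dhcpd: no memory for option buffer.",
    "Mar  3 14:23:41 slam2 dhcpd: failover: link startup timeout",
]

for (host, kind), count in sorted(tally(sample).items()):
    print(host, kind, count)
```

Feeding it the full log files from both peers (for example via
tally(open(path)) with the appropriate path) would show how many "no memory"
messages were logged and whether the duplicate uid lease warnings differ
noticeably between the two hosts.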

I ended up restarting the dhcpd process on both servers, and everything
seems to be back to normal now. Both servers are running FreeBSD 6.3.
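
For reference, the data size limit mentioned above is a per-process resource
limit that child processes inherit. The following is a minimal sketch of how
to read it, assuming Python is available on the host. Note that it reports the
limits of the environment it is run from, which may differ from the limits
dhcpd was started with. The format_limit() helper is made up for illustration.

```python
import resource


def format_limit(value):
    """Render a resource limit in MB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / (1024 * 1024):.0f} MB"


# RLIMIT_DATA is the maximum size of the process data segment (heap).
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
print("data size soft limit:", format_limit(soft))
print("data size hard limit:", format_limit(hard))
```

A soft limit of 512 MB here would match the ceiling the dhcpd process ran into.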

So, my questions are:

- Any idea what might have happened here? As far as we know, the failover
peers were able to communicate with each other at all times.
- Are there any rules of thumb for how big the dhcpd process is expected to
grow, presumably based on the number of leases?
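
As a single data point for the second question, the normal-operation figures
given above (90-100 MB for around 100k leases) work out to roughly 1 KB per
lease. The sketch below only does that arithmetic. It is an observation from
this one installation, not a rule, and the bytes_per_lease() helper is made up
for illustration.

```python
def bytes_per_lease(process_mb, leases):
    """Average process memory per lease, in bytes, for a given process size."""
    return process_mb * 1024 * 1024 / leases


# Figures from this installation: 90-100 MB process size, about 100k leases.
for size_mb in (90, 100):
    per_lease = bytes_per_lease(size_mb, 100_000)
    print(f"{size_mb} MB / 100000 leases = {per_lease:.0f} bytes per lease")
```

By the same measure, the 500 MB seen during the incident would correspond to
about five times the normal per-lease footprint, which suggests the growth was
not driven by the lease count alone.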

Steinar Haug, Nethelp consulting, sthaug at nethelp.no

