Tons of "lease imbalance messages" then crash
ashley.hatch at unlv.edu
Wed Feb 7 21:34:51 UTC 2007
I run a pair of Debian-based ISC DHCPd 3.0.1-2 servers that have served
5000+ clients reliably for over 4 years now. About 6 months ago we had to
upgrade from an earlier version of ISC DHCPd which did not work properly
with our redundant DHCP helpers. The old version never hiccuped once
(sadly I cannot recall the old version, but it is from Dec 2003). The new
version solved the multiple DHCP helpers issue, but seems to have
introduced a new failover-related problem. I wanted to get some input on
it before I blindly upgrade again, as I cannot find a bug fix that I can
say exactly matches our problem.
What happens is that one of the two servers (it has occurred on both, but
now only occurs on the primary) will stop "hearing" DHCP requests and will
only log entries like:
Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 13
Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 7
Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 3
Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 8
... hundreds of times.
When it gets into this mode it effectively stops doing any failover or
DHCP service. It will handle maybe one DHCP request per 100 LTS log
entries, and it logs a few dozen LTS entries a second for minutes at a
time. Sometimes it comes back on its own; other times it will quietly exit
with no core dump or error message. While I have always seen the LTS
messages as part of normal operation, they are normally spread apart and
happen only in small clusters, not 1000 at a time. I have verified that
network connectivity is not being lost at the servers by using constant
pings both to and from both machines to multiple hosts. I have also
verified that both the memory and processor are working properly using
Memtest86+ and Prime95 on both servers.
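To put numbers on the bursts described above, a small script along these
lines could tally the "lease imbalance" entries per host and per second and
flag the seconds that exceed a threshold. This is only a sketch: the function
names, the threshold, and the assumption that the log uses the classic
syslog layout shown in the excerpt are illustrative, not part of dhcpd.

```python
import re
from collections import Counter

# Assumes the classic syslog layout from the excerpt above:
#   "Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 13"
# An optional "[pid]" after "dhcpd" is tolerated.
LTS_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+"
    r"dhcpd(?:\[\d+\])?:\s+lease imbalance - lts = (-?\d+)"
)


def count_lts_per_second(lines):
    """Count 'lease imbalance' entries per (host, timestamp) pair."""
    counts = Counter()
    for line in lines:
        match = LTS_RE.match(line)
        if match:
            # Collapse runs of spaces so "Feb  7" and "Feb 7" share a key.
            timestamp = " ".join(match.group(1).split())
            counts[(match.group(2), timestamp)] += 1
    return counts


def find_bursts(counts, threshold=3):
    """Return (host, timestamp, count) for seconds at or above threshold."""
    return sorted(
        (host, ts, n) for (host, ts), n in counts.items() if n >= threshold
    )


if __name__ == "__main__":
    # The first four lines are taken from the excerpt above; the last one
    # is a hypothetical non-matching line that should be ignored.
    sample = [
        "Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 13",
        "Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 7",
        "Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 3",
        "Feb 7 12:23:40 merry dhcpd: lease imbalance - lts = 8",
        "Feb 7 12:23:41 merry dhcpd: (some other message)",
    ]
    for host, ts, n in find_bursts(count_lts_per_second(sample)):
        print(f"{host} {ts}: {n} lease imbalance entries")
```

In practice the lines would be read from the syslog file that dhcpd writes
to (the path depends on the syslog configuration), and the threshold would
be set well above the small clusters seen in normal operation.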
I have tried rebooting both boxes, which does not help. It can go a month
without crashing, but lately it has been doing it multiple times a day, and
it could become a real problem if it continues. I am at the point of giving
up and just upgrading to 3.0.4, but I hate changing versions when I don't
know what is causing the problem in the first place.
Any insight would be appreciated, or just "upgrade the server" from
someone wise in the newer versions.
Thanks,
Ashley Hatch