[Kea-users] Kea HA issue with terminating connection

sebastians sebastians at selfnet.de
Mon Jun 12 20:24:44 UTC 2023


Hi Darren,

the memfile is not the problem, memory usage is at 22% and there is
still plenty of disk space. The logs I provided are from the
standby server. The kea-ctrl-agent logs are empty for the relevant time
interval.

If it helps, these are the logs on the active DHCP server leading
up to the termination:

Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1, cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:39 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:04:45 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:04:56 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625735366400] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:05:28 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625726973696] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL_SUBNET [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 lease in the subnet 123.123.123.123/30, subnet-id 30926, shared network (none)
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1], cid=[], tid=0x0: failed to allocate an IPv4 address after 1 attempt(s)
Jun 12 15:05:31 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.alloc-engine.139625752151808] ALLOC_ENGINE_V4_ALLOC_FAIL_CLASSES [hwtype=1], cid=[], tid=0x0: Failed to allocate an IPv4 address for client with classes: ALL, HA_primary-dhcp, VENDOR_CLASS_MSFT 5.0, UNKNOWN
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.ha-hooks.139625701795584] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: WARN  [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_CONFLICT [hwtype=1], cid=[], tid=0x0: lease update to standby-dhcp (http://dhcp-2:8001/) returned conflict status code: ResourceBusy: IP address:123.123.123.123 could not be updated. (error code 4)
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: ERROR [kea-dhcp4.ha-hooks.139625718580992] HA_LEASE_UPDATE_REJECTS_CAUSED_TERMINATION too many rejected lease updates cause the HA service to terminate
Jun 12 15:05:39 dhcp-1 kea-dhcp4[564812]: ERROR [kea-dhcp4.ha-hooks.139625718580992] HA_TERMINATED HA service terminated due to an unrecoverable condition. Check previous error message(s), address the problem and restart!

Both servers are currently running version 2.3.8

Thanks,

Sebastian

On Mon Jun 12, 2023 at 8:50 PM CEST, Darren Ankney wrote:
> Hi Sebastian,
>
> I was hoping to see something jump out at me here, but it all looks
> good to me.  Is there some possibility that Kea is running out of
> memory or disk space for the memfile?  I assume these messages are
> appearing on the standby server?  It looks like this is using
> kea-ctrl-agent as the listener.  Is there anything being logged by
> kea-ctrl-agent?
>
> I see I never asked you what version of Kea this is, can you include
> that as well?
>
> Thank you,
>
> Darren Ankney
>
> On Mon, Jun 12, 2023 at 1:54 PM sebastians <sebastians at selfnet.de> wrote:
> >
> > Hi Darren,
> >
> > this is what Kea logs shortly before terminating the HA connection (any
> > personal information was removed from the logs):
> >
> > Mar 12 19:25:06 dhcp-2 kea-dhcp4[203037]: WARN  [kea-dhcp4.lease-cmds-hooks.139670034884352] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688706, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 2907, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
> > Mar 12 19:25:06 dhcp-2 kea-dhcp4[203037]: WARN  [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688706, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 2907, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
> > Mar 12 19:27:28 dhcp-2 kea-dhcp4[203037]: WARN  [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678688848, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 3812, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
> > Mar 12 19:32:05 dhcp-2 kea-dhcp4[203037]: WARN  [kea-dhcp4.lease-cmds-hooks.139670018098944] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678689125, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 274, "valid-lft": 43200 }, reason: ResourceBusy: IP address:100.72.19.33 could not be updated.)
> > Mar 12 19:32:34 dhcp-2 kea-dhcp4[203037]: WARN  [kea-dhcp4.lease-cmds-hooks.139670009706240] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "", "expire": 1678689154, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "", "hw-address": "", "ip-address": "", "state": 0, "subnet-id": 113, "valid-lft": 43200 }, reason: ResourceBusy: IP address:123.123.123.123 could not be updated.)
> > Mar 12 19:32:36 dhcp-2 kea-dhcp4[203037]: ERROR [kea-dhcp4.ha-hooks.139670104323840] HA_TERMINATED HA service terminated due to an unrecoverable condition. Check previous error message(s), address the problem and restart!
> > Mar 12 22:11:09 dhcp-2 kea-dhcp4[203037]: ERROR [kea-dhcp4.packets.139670138794688] DHCP4_BUFFER_RECEIVE_FAIL error on attempt to receive packet: Truncated DHCPv4 packet (len=0) received, at least 236 is expected.
> >
> > This is the HA part of the configuration on the standby dhcp server:
> >
> > "hooks-libraries": [{
> >     "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so",
> >     "parameters": {}
> > },
> > {
> >     "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_stat_cmds.so",
> >     "parameters": {}
> > },
> > {
> >     "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
> >     "parameters": {
> >         "high-availability": [{
> >             "this-server-name": "standby-dhcp",
> >             "mode": "hot-standby",
> >             "heartbeat-delay": 10000,
> >             "max-response-delay": 60000,
> >             "max-ack-delay": 5000,
> >             "max-unacked-clients": 5,
> >             "peers": [{
> >                 "name": "primary-dhcp",
> >                 "url": "http://dhcp-1:8001/",
> >                 "role": "primary",
> >                 "auto-failover": true
> >             }, {
> >                 "name": "standby-dhcp",
> >                 "url": "http://dhcp-2:8001/",
> >                 "role": "standby",
> >                 "auto-failover": true
> >             }]
> >         }]
> >     }
> > }]
> >
> > The configuration on dhcp-1 is similar except for the "this-server-name"
> > attribute.
> >
> > Thanks,
> >
> > Sebastian
> >
> > On Sun Jun 11, 2023 at 12:35 PM CEST, Darren Ankney wrote:
> > > Hi Sebastian,
> > >
> > > Please share the HA messages surrounding and including one of these
> > > terminations.  The termination, or a proceeding message should state
> > > the cause.  The 'ResourceBusy' messages are not warnings I am familiar
> > > with.  Could you also send your HA configuration from each server
> > > (anonymized of course)?
> > >
> > > Thank you,
> > >
> > > Darren Ankney
> > >
> > > On Sat, Jun 10, 2023 at 11:51 AM sebastians <sebastians at selfnet.de> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > we recently migrated our DHCP setup from dhcpd to Kea. It runs on
> > > > two servers with hot standby and a memfile backend for the leases. Kea
> > > > assigns IP addresses for around 7000 pools.
> > > >
> > > > Over the past few months the HA connection terminated in random intervals.
> > > > From looking at the logs on the passive node I can see a lot of
> > > > 'ResourceBusy: IP address ... could not be updated' warnings prior to
> > > > the connection terminating. Since multithreading is enabled I suspected
> > > > this may be due to the threads encountering a resource lock on the memfile.
> > > > I suppose after the lease update fails a few times, the connection is terminated.
> > > >
> > > > Is the 'ResourceBusy' warning the cause for the terminating HA connection and
> > > > is there any way to fix the underlying issue? Any ideas on the issue are greatly
> > > > appraciated.
> > > >
> > > > Thanks,
> > > >
> > > > Sebastian
> > > > --
> > > > ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.
> > > >
> > > > To unsubscribe visit https://lists.isc.org/mailman/listinfo/kea-users.
> > > >
> > > > Kea-users mailing list
> > > > Kea-users at lists.isc.org
> > > > https://lists.isc.org/mailman/listinfo/kea-users
> > > --
> > > ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.
> > >
> > > To unsubscribe visit https://lists.isc.org/mailman/listinfo/kea-users.
> > >
> > > Kea-users mailing list
> > > Kea-users at lists.isc.org
> > > https://lists.isc.org/mailman/listinfo/kea-users
> >
> -- 
> ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.
>
> To unsubscribe visit https://lists.isc.org/mailman/listinfo/kea-users.
>
> Kea-users mailing list
> Kea-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/kea-users



More information about the Kea-users mailing list