[Kea-users] Dropping the packets in load balancing

Darren Ankney darren.ankney at gmail.com
Sun May 21 11:08:07 UTC 2023


Hi Kraishak,

For question 1, the default for max-ack-delay is 10000 ms (10 seconds).
This parameter defines the time that must be exceeded by the "secs"
field in the client's packet before that client is counted as unacked.
Combined with max-unacked-clients (default 10), this determines when
automatic partner-down will occur.  If you have neither of these
parameters set in your configuration, then auto partner-down will occur
once 11 unique clients have sent packets with a "secs" value greater
than 10 (that is, 11 or more).
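
To make the arithmetic above concrete, here is a minimal Python sketch of
the counting rule as I understand it.  It is not Kea's code: the function
names are made up for illustration, and it models only the check that
applies once heartbeat communication with the partner is already
interrupted.  It assumes "secs" is in seconds and max-ack-delay is in
milliseconds.

```python
from typing import Dict


def is_unacked(secs: int, max_ack_delay_ms: int = 10000) -> bool:
    """A client counts as unacked once its "secs" value exceeds max-ack-delay."""
    return secs * 1000 > max_ack_delay_ms


def should_enter_partner_down(
    client_secs: Dict[str, int],
    max_ack_delay_ms: int = 10000,
    max_unacked_clients: int = 10,
) -> bool:
    """client_secs maps a unique client id to the latest "secs" value seen."""
    if max_unacked_clients == 0:
        # Zero disables the wait for unacked clients entirely.
        return True
    unacked = [c for c, s in client_secs.items() if is_unacked(s, max_ack_delay_ms)]
    # The count has to exceed the threshold, so the default of 10 needs 11 clients.
    return len(unacked) > max_unacked_clients


def clients_left(unacked_so_far: int, max_unacked_clients: int = 10) -> int:
    """Assumed arithmetic behind "N clients left" in the UNACKED log message."""
    return max_unacked_clients - unacked_so_far + 1
```

With the defaults, 11 distinct clients that each report secs=11 make
should_enter_partner_down() return True, while 10 such clients, or any
number of clients reporting secs=10, do not.  clients_left(1, 13) gives 13,
which matches the log line quoted further down in this thread.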

As for what are sensible values for these parameters, the answer is
that it depends.  Only you can make that judgement.  For example, if you
have tens of thousands of clients, some very small percentage of them
are probably broken (not able to obtain an IP for some reason) and
always have the secs field incremented, so the threshold has to be high
enough to overcome that low level of noise.  As for the amount of time
to wait, that depends on other factors, such as whether your servers
regularly lose contact briefly.  If so, it might be best to set that
number higher.  This is not an exhaustive list.

Kevin has given you the answer to question 2.  To expound a little:
setting max-unacked-clients to zero disables the feature of waiting
until clients are seen not receiving responses before entering
partner-down.  With that check disabled, a split-brain scenario could
result, in which both servers allocate the same address to different
clients.  While you are testing, if using perfdhcp, please be sure to
use the -Y and -y options to simulate a client that has been waiting
for a lease.  Example:

perfdhcp -4 -r 1 -R 1000 -t 60 -l <interface> -p 300 -Y 10 -y 300

see: https://kea.readthedocs.io/en/latest/man/perfdhcp.8.html for details.

Thank you,

Darren Ankney

On Thu, May 18, 2023 at 6:18 AM Kraishak Mahtha <kraishak.edu at gmail.com> wrote:
>
> Hi Darren, Thank you for the quick response and explanation about the 1st case
>
> I want to confirm/take suggestions on two more failover parameters
> i.e. 1) The "max-ack-delay" parameter has a default value of 10000ms.
> --> I have currently set this parameter to zero in my latest config for testing. Is setting it to 4000 ms, similar to the ISC default value of its equivalent parameter, recommended in a production environment?
>
> 2) Regarding the UNACKED count parameter
>
> Instead of HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED, I get the messages below.
>
> --> When Server A is in the communication-recovery stage, after reaching the total percentage of its free IPs, I get log messages like
> 2023-05-15 16:12:24.024 DEBUG [kea-dhcp4.ha-hooks/32411.139751604614912] HA_BUFFER4_RECEIVE_NOT_FOR_US [hwtype=1 15:98:14:00:00:49], cid=[01:15:98:14:00:00:49], tid=0x70119197: dropping query to be processed by another server
> 2023-05-15 16:12:24.024 INFO  [kea-dhcp4.ha-hooks/32411.139751604614912] HA_COMMUNICATION_INTERRUPTED_CLIENT4 [hwtype=1 15:98:14:00:00:4a], cid=[01:15:98:14:00:00:4a], tid=0x9079824: new client attempting to get a lease from the partner
> 2023-05-15 16:12:24.024 DEBUG [kea-dhcp4.ha-hooks/32411.139751604614912] HA_BUFFER4_RECEIVE_NOT_FOR_US [hwtype=1 15:98:14:00:00:4a], cid=[01:15:98:14:00:00:4a], tid=0x9079824: dropping query to be processed by another server
> 2023-05-15 16:12:24.504 INFO  [kea-dhcp4.ha-hooks/32411.139751579436800] HA_COMMUNICATION_INTERRUPTED_CLIENT4 [hwtype=1 15:98:14:00:00:4c], cid=[01:15:98:14:00:00:4c], tid=0x50272464: new client attempting to get a lease from the partner
> 2023-05-15 16:12:24.505 DEBUG [kea-dhcp4.ha-hooks/32411.139751579436800] HA_BUFFER4_RECEIVE_NOT_FOR_US [hwtype=1 15:98:14:00:00:4c], cid=[01:15:98:14:00:00:4c], tid=0x50272464: dropping query to be processed by another server
> 2023-05-15 16:12:24.514 INFO  [kea-dhcp4.ha-hooks/32411.139751587829504] HA_COMMUNICATION_INTERRUPTED_CLIENT4 [hwtype=1 15:98:14:00:00:5f], cid=[01:15:98:14:00:00:5f], tid=0x21365785: new client attempting to get a lease from the partner
> 2023-05-15 16:12:24.514 DEBUG [kea-dhcp4.ha-hooks/32411.139751587829504] HA_BUFFER4_RECEIVE_NOT_FOR_US [hwtype=1 15:98:14:00:00:5f], cid=[01:15:98:14:00:00:5f], tid=0x21365785: dropping query to be processed by another server
> 2023-05-15 16:12:24.514 INFO  [kea-dhcp4.ha-hooks/32411.139751587829504] HA_COMMUNICATION_INTERRUPTED_CLIENT4 [hwtype=1 15:98:14:00:00:60], cid=[01:15:98:14:00:00:60], tid=0x78905607: new client attempting to get a lease from the partner
> 2023-05-15 16:12:24.520 DEBUG [kea-dhcp4.ha-hooks/32411.139751587829504] HA_BUFFER4_RECEIVE_NOT_FOR_US [hwtype=1 15:98:14:00:00:60], cid=[01:15:98:14:00:00:60], tid=0x78905607: dropping query to be processed by another server
> 2023-05-15 16:12:24.529 INFO  [kea-dhcp4.ha-hooks/32411.139751579436800] HA_COMMUNICATION_INTERRUPTED_CLIENT4 [hwtype=1 15:98:14:00:00:63], cid=[01:15:98:14:00:00:63], tid=0x2925622: new client attempting to get a lease from the partner
>
> I don't know how the hashing algorithm decides which peer needs to grant a lease, but for the same MAC addresses I previously got leases from the same server when both were live. So basically the HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED count is not being incremented. Can we check this any other way, or are there conditions that must be met for a client to be counted as unacked?
>
> One last question: in one of the forums I saw that setting max-unacked-clients to zero solved the issue, but is it advisable to use zero in a production-like environment?
>
> Thanks
> Kraishak
>
>
>
> On Wed, May 17, 2023 at 4:03 PM Darren Ankney <darren.ankney at gmail.com> wrote:
>>
>> Hi Kraishak,
>>
>> This also appears to be designed behavior.  I found this:
>>
>> waiting - each started server instance enters this state. A backup
>> server transitions directly from this state to the backup state. An
>> active server sends a heartbeat to its partner to check its state; if
>> the partner appears to be unavailable, the server transitions to the
>> partner-down state. If the partner is available, the server
>> transitions to the syncing or ready state, depending on the setting of
>> the sync-leases configuration parameter. If both servers appear to be
>> in the waiting state (concurrent startup), the primary server
>> transitions to the next state first. The secondary or standby server
>> remains in the waiting state until the primary transitions to the
>> ready state.
>>
>> under: https://kea.readthedocs.io/en/kea-2.2.0/arm/hooks.html#server-states
>>
>> So, TLDR, the servers enter waiting state when they start.  If the
>> partner server does not appear, then the sole running server proceeds
>> directly to partner-down.
>>
>> Thank you,
>>
>> Darren Ankney
>>
>> On Tue, May 16, 2023 at 10:40 AM Kraishak Mahtha <kraishak.edu at gmail.com> wrote:
>> >
>> > Hi Peter,
>> >
>> > but only one
>> > HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED message.
>> >
>> > 2023-05-15 16:07:30.127 INFO  [kea-dhcp4.ha-hooks/32411.139751579436800] HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED [hwtype=1 34:98:b5:dc:1f:99], cid=[no info], tid=0x4be21c7f: partner server failed to respond, 1 clients unacked so far, 13 clients left before transitioning to the partner-down state
>> > -----> ok, but this seems to not apply to my first test case
>> > (i.e Where I have no active leases on both servers and started only one server after 6 iterations it went to partner down)
>> >
>> > Logs of my first test case:
>> > =====================
>> > 2023-05-15 15:18:07.006 INFO  [kea-dhcp4.ha-hooks/27021.140594439104704] HA_INIT_OK loading High Availability hooks library successful.
>> > 2023-05-15 15:18:07.014 INFO  [kea-dhcp4.ha-hooks/27021.140594439104704] HA_LOCAL_DHCP_DISABLE local DHCP service is disabled while the dhcp1 is in the WAITING state
>> > 2023-05-15 15:18:07.014 INFO  [kea-dhcp4.ha-hooks/27021.140594439104704] HA_SERVICE_STARTED started high availability service in load-balancing mode as primary server
>> > 2023-05-15 15:18:17.027 WARN  [kea-dhcp4.ha-hooks/27021.140594206709504] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:18:27.899 WARN  [kea-dhcp4.ha-hooks/27021.140594198316800] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:18:37.909 WARN  [kea-dhcp4.ha-hooks/27021.140594215102208] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:18:47.919 WARN  [kea-dhcp4.ha-hooks/27021.140594223494912] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:18:58.730 WARN  [kea-dhcp4.ha-hooks/27021.140594198316800] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:19:09.245 WARN  [kea-dhcp4.ha-hooks/27021.140594223494912] HA_HEARTBEAT_COMMUNICATIONS_FAILED failed to send heartbeat to dhcp2 (http://192.168.0.126:8001): Connection refused
>> > 2023-05-15 15:19:09.245 WARN  [kea-dhcp4.ha-hooks/27021.140594223494912] HA_COMMUNICATION_INTERRUPTED communication with dhcp2 is interrupted
>> > 2023-05-15 15:19:09.245 INFO  [kea-dhcp4.ha-hooks/27021.140594223494912] HA_STATE_TRANSITION server transitions from WAITING to PARTNER-DOWN state, partner state is UNDEFINED
>> >
>> > Can you guide me on how this worked?
>> >
>> > Thanks
>> > Kraishak
>> >
>> >
>> > On Tue, May 16, 2023 at 2:31 PM Peter Davies <peterd at isc.org> wrote:
>> >>
>> >> Hi Kraishak,
>> >>    Looking at your log file, it appears that "Server A" only saw one unacked
>> >> client, so it didn't transition to the partner-down state. I see several
>> >> HA_COMMUNICATION_INTERRUPTED_CLIENT4 messages
>> >> but only one
>> >> HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED message.
>> >>
>> >> 2023-05-15 16:07:30.127 INFO  [kea-dhcp4.ha-hooks/32411.139751579436800] HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED [hwtype=1 34:98:b5:dc:1f:99], cid=[no info], tid=0x4be21c7f: partner server failed to respond, 1 clients unacked so far, 13 clients left before transitioning to the partner-down state
>> >>
>> >> Your configuration contains the following statement:
>> >>    "max-unacked-clients": 13
>> >> The "max-ack-delay" parameter has a default value of 10000 ms.
>> >> I suggest you check that your traffic generator correctly increments the "secs"
>> >> field and modify your HA settings appropriately.
>> >>
>> >> From the Kea ARM:
>> >>
>> >> HA_COMMUNICATION_INTERRUPTED_CLIENT4
>> >> %1: new client attempting to get a lease from the partner
>> >> This informational message is issued when the surviving server observes a DHCP
>> >> packet sent to the partner with which the communication is interrupted. The
>> >> client whose packet is observed is not yet considered “unacked” because the
>> >> secs field value does not exceed the configured threshold specified with
>> >> max-ack-delay.
>> >>
>> >> HA_COMMUNICATION_INTERRUPTED_CLIENT4_UNACKED
>> >> %1: partner server failed to respond, %2 clients unacked so far, %3 clients left before transitioning to the partner-down state
>> >> This informational message is issued when the surviving server determines that
>> >> its partner failed to respond to the DHCP query and that this client is considered
>> >> to not be served by the partner. The surviving server counts such clients, and if
>> >> the number of such clients exceeds the max-unacked-clients threshold, the server
>> >> will transition to the partner-down state. The first argument contains client
>> >> identification information. The second argument specifies the number of clients
>> >> to which the server has failed to respond. The third argument specifies the number
>> >> of additional clients, which, if not provisioned, will cause the server to transition
>> >> to the partner-down state.
>> >>
>> >> Kind Regards Peter
>> >
>> > --
>> > ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.
>> >
>> > To unsubscribe visit https://lists.isc.org/mailman/listinfo/kea-users.
>> >
>> > Kea-users mailing list
>> > Kea-users at lists.isc.org
>> > https://lists.isc.org/mailman/listinfo/kea-users

