[Kea-users] HA Load Balancing not working

Marcin Siodelski marcin at isc.org
Mon Oct 8 18:09:33 UTC 2018


Hello,

First of all, please note that the HA time values, i.e.
"heartbeat-delay", "max-ack-delay" and "max-response-delay", are
specified in milliseconds. Therefore, in your case heartbeat messages
will be sent every 10 ms, which is a very high frequency that may
impact the server's performance. Also, "max-ack-delay", the time after
which the server considers a DHCP packet not to have been answered by
the partner, is very low in your case. If these low values are provided
for testing purposes only, this is fine.
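
To make the unit mismatch concrete, here is a minimal sketch. It assumes the values in the quoted configuration (10, 10 and 60) were meant as seconds, which the original post does not confirm, and converts them to milliseconds. The helper name to_milliseconds is purely illustrative.

```python
# Timer values as they appear in the quoted configuration.
# Assumption: the author intended them as seconds.
intended_seconds = {
    "heartbeat-delay": 10,
    "max-ack-delay": 10,
    "max-response-delay": 60,
}


def to_milliseconds(timers_in_seconds):
    """Return a copy of the timer map with every value converted to ms."""
    return {name: seconds * 1000 for name, seconds in timers_in_seconds.items()}


ha_timers_ms = to_milliseconds(intended_seconds)

# Print the entries in the same form they would take in the HA configuration.
for name, value in ha_timers_ms.items():
    print(f'"{name}": {value},')
```

Whether 10000, 10000 and 60000 are sensible values for a given deployment is a separate question. The sketch only shows the conversion.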

I am guessing that the problem, whereby you don't observe the surviving
server transitioning from "load-balancing" to "partner-down", is caused
by the fact that you are simulating only a single client rather than
many different clients.

Your "max-unacked-messages" setting of 10 means that the surviving
server watches the traffic that should be processed by the offline
partner, and it expects at least 10 messages from different clients to
go unanswered before it transitions to "partner-down". If you're
sending DHCP messages from a single MAC address, they count as one
client. Try lowering "max-unacked-messages" to 1, or even set it to 0
to disable this mechanism, which causes the surviving server to
transition to the "partner-down" state when it sees heartbeats fail.
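
The decision described above can be modelled roughly as follows. This is a simplified sketch based only on the description in this message, not Kea's actual implementation. The function name, its arguments and the exact comparison are assumptions made for illustration.

```python
def should_enter_partner_down(heartbeats_failing, unacked_macs, max_unacked_messages):
    """Simplified model of the decision described above (not Kea's code).

    heartbeats_failing: True once heartbeats to the partner fail.
    unacked_macs: MAC addresses seen on unanswered messages meant for the partner.
    max_unacked_messages: configured threshold; 0 disables the client check.
    """
    if not heartbeats_failing:
        return False
    if max_unacked_messages == 0:
        return True
    # Repeated messages from one MAC address count as a single client.
    distinct_clients = len(set(unacked_macs))
    return distinct_clients >= max_unacked_messages


# Twelve REQUESTs from the same MAC address, as in the quoted dhtest run.
single_client = ["00:00:00:11:11:12"] * 12

print(should_enter_partner_down(True, single_client, 10))  # False
print(should_enter_partner_down(True, single_client, 1))   # True
print(should_enter_partner_down(True, single_client, 0))   # True
```

With a threshold of 10, a single MAC address never satisfies the condition no matter how many times it retries, which matches the behaviour seen in the quoted logs.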

If your test tool allows you to simulate many different clients, that's
even better, but the number of different clients should be at least 20.
That way, roughly 10 of them should send DHCP queries to the dead server.
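
One way to simulate 20 different clients is to generate distinct MAC addresses and run the test tool once per address. The sketch below only builds the command lines. It reuses the dhtest flags (-i and -m), the interface and the starting MAC address from the quoted run. The helper names and the sequential numbering are illustrative assumptions, and which server each address is assigned to is decided by the servers, so the split between them is only approximate.

```python
def sequential_macs(base_mac, count):
    """Yield `count` MAC addresses, starting at base_mac and incrementing by one."""
    start = int(base_mac.replace(":", ""), 16)
    for offset in range(count):
        # Keep the value within 48 bits so it always formats as six octets.
        value = (start + offset) & 0xFFFFFFFFFFFF
        raw = f"{value:012x}"
        yield ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def dhtest_commands(interface, base_mac, count):
    """Build one dhtest command line (as an argument list) per simulated client."""
    return [["./dhtest", "-i", interface, "-m", mac]
            for mac in sequential_macs(base_mac, count)]


commands = dhtest_commands("ens18", "00:00:00:11:11:12", 20)
for command in commands:
    print(" ".join(command))
```

Each argument list can be passed to subprocess.run() on the test machine to actually send the traffic.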

Kind Regards,
Marcin Siodelski
ISC DHCP Engineering

On 08.10.2018 15:07, sven.roehrig at web.de wrote:
> Hi,
> 
> I have a setup with 2 Kea servers in a load-balancing configuration. Both
> servers work when active, but when I shut down one server to
> simulate an error I do get an OFFER, but REQUESTs are not answered.
> 
> Do I have a wrong understanding of how HA load-balancing works, or
> maybe I have a configuration issue? I expect server1 to enter the
> partner-down state, but I don't see anything in the logs except “The server
> is likely to be offline, error code 1”.
> 
> Why does a request get parked when the partner is down: “packet is
> parked, because a callout set the next step to PARK”?
> 
>  
> 
> "high-availability": [ {
>     "this-server-name": "server1",
>     "mode": "load-balancing",
>     "heartbeat-delay": 10,
>     "max-ack-delay": 10,
>     "max-response-delay": 60,
>     "max-unacked-messages": 10,
>     "peers": [
>         {
>             "name": "server1",
>             "url": "http://192.168.62.5:8080/",
>             "role": "primary",
>             "auto-failover": true
>         },
>         {
>             "name": "server2",
>             "url": "http://192.168.62.6:8080/",
>             "role": "secondary",
>             "auto-failover": true
>         }
>     ]
> } ]
> 
>  
> 
>  
> 
> "high-availability": [ {
>     "this-server-name": "server2",
>     "mode": "load-balancing",
>     "heartbeat-delay": 10,
>     "max-ack-delay": 10,
>     "max-response-delay": 60,
>     "max-unacked-messages": 10,
>     "peers": [
>         {
>             "name": "server1",
>             "url": "http://192.168.62.5:8080/",
>             "role": "primary",
>             "auto-failover": true
>         },
>         {
>             "name": "server2",
>             "url": "http://192.168.62.6:8080/",
>             "role": "secondary",
>             "auto-failover": true
>         }
>     ]
> 
>  
> 
>  
> 
> ./dhtest -i ens18 -m 00:00:00:11:11:12
> 
> DHCP discover sent        - Client MAC : 00:00:00:11:11:12
> 
> DHCP offer received      - Offered IP : 192.168.225.10
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
> DHCP request sent         - Client MAC : 00:00:00:11:11:12
> 
>  
> 
> 2018-10-08 14:50:11.639 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:13.641 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:15.645 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:17.648 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:19.651 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:21.654 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:23.657 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:25.660 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:27.664 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:29.667 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:31.670 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:33.673 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:35.676 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:37.679 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:39.682 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:41.476 INFO  [kea-dhcp4.options/1564] EVAL_RESULT
> Expression CLASS-CPE evaluated to 1
> 
> 2018-10-08 14:50:41.476 INFO  [kea-dhcp4.leases/1564] DHCP4_LEASE_ADVERT
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0xa4da4cd: lease
> 192.168.225.10 will be advertised
> 
> 2018-10-08 14:50:41.477 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:41.478 INFO  [kea-dhcp4.options/1564] EVAL_RESULT
> Expression CLASS-CPE evaluated to 1
> 
> 2018-10-08 14:50:41.481 INFO  [kea-dhcp4.leases/1564] DHCP4_LEASE_ALLOC
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0xa4da4cd: lease
> 192.168.225.10 has been allocated
> 
> 2018-10-08 14:50:42.483 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_LEASE_UPDATE_FAILED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0xa4da4cd: lease update to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:43.484 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:45.488 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:45.700 INFO  [kea-dhcp4.options/1564] EVAL_RESULT
> Expression CLASS-CPE evaluated to 1
> 
> 2018-10-08 14:50:45.712 INFO  [kea-dhcp4.leases/1564] DHCP4_LEASE_ALLOC
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0xa4da4cd: lease
> 192.168.225.10 has been allocated
> 
> 2018-10-08 14:50:46.713 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_LEASE_UPDATE_FAILED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0xa4da4cd: lease update to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:47.715 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:49.717 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:50.418 INFO  [kea-dhcp4.options/1564] EVAL_RESULT
> Expression CLASS-CPE evaluated to 1
> 
> 2018-10-08 14:50:50.434 INFO  [kea-dhcp4.leases/1564] DHCP4_LEASE_ALLOC
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0xa4da4cd: lease
> 192.168.225.10 has been allocated
> 
> 2018-10-08 14:50:51.435 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_LEASE_UPDATE_FAILED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0xa4da4cd: lease update to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 14:50:52.437 WARN  [kea-dhcp4.ha-hooks/1564]
> HA_HEARTBEAT_FAILED heartbeat to server1 (http://192.168.62.5:8080/)
> failed: unable to forward command to the dhcp4 service: No such file or
> directory. The server is likely to be offline, error code 1
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.packets/1602]
> DHCP4_SUBNET_SELECTED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0x6569ccdb: the subnet with ID 1 was selected for client assignments
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.packets/1602] DHCP4_SUBNET_DATA
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0x6569ccdb: the
> selected subnet details: 192.168.225.0/24
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ALL_IDENTIFIER get all hosts with reservations using
> identifier: hwaddr=000000111112
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ALL_IDENTIFIER_COUNT using identifier hwaddr=000000111112,
> found 0 host(s)
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.dhcp4/1602]
> DHCP4_CLASS_ASSIGNED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0x6569ccdb: client packet has been assigned to the following
> class(es): UNKNOWN
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.dhcp4/1602]
> DHCP4_CLASS_ASSIGNED [hwtype=1 00:00:00:11:11:12], cid=[no info],
> tid=0x6569ccdb: client packet has been assigned to the following
> class(es): HA_server2, ALL, CLASS-CPE, UNKNOWN
> 
> 2018-10-08 15:05:35.327 DEBUG [kea-dhcp4.ddns/1602]
> DHCP4_CLIENT_HOSTNAME_PROCESS [hwtype=1 00:00:00:11:11:12], cid=[no
> info], tid=0x6569ccdb: processing client's Hostname option
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.dhcpsrv/1602]
> DHCPSRV_MYSQL_GET_HWADDR obtaining IPv4 leases for hardware address
> hwtype=1 00:00:00:11:11:12
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ONE_SUBNET_ID_ADDRESS4 get one host with reservation for
> subnet id 1 and IPv4 address 192.168.225.10
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ALL_ADDRESS4 get all hosts with reservations for IPv4
> address 192.168.225.10
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ALL_ADDRESS4_COUNT using address 192.168.225.10, found 0
> host(s)
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_CFG_GET_ONE_SUBNET_ID_ADDRESS4_NULL host not found using subnet id
> 1 and address 192.168.225.10
> 
> 2018-10-08 15:05:35.328 DEBUG [kea-dhcp4.hosts/1602]
> HOSTS_MGR_ALTERNATE_GET4_SUBNET_ID_ADDRESS4 trying alternate sources for
> host using subnet id 1 and address 192.168.225.10
> 
> 2018-10-08 15:05:35.329 DEBUG [kea-dhcp4.dhcpsrv/1602]
> DHCPSRV_MYSQL_GET_ADDR4 obtaining IPv4 lease for address 192.168.225.10
> 
> 2018-10-08 15:05:35.329 DEBUG [kea-dhcp4.alloc-engine/1602]
> ALLOC_ENGINE_V4_REQUEST_EXTEND_LEASE [hwtype=1 00:00:00:11:11:12],
> cid=[no info], tid=0x6569ccdb: extending lifetime of the lease for
> address 192.168.225.10
> 
> 2018-10-08 15:05:35.329 DEBUG [kea-dhcp4.dhcpsrv/1602]
> DHCPSRV_MYSQL_UPDATE_ADDR4 updating IPv4 lease for address 192.168.225.10
> 
> 2018-10-08 15:05:35.332 INFO  [kea-dhcp4.leases/1602] DHCP4_LEASE_ALLOC
> [hwtype=1 00:00:00:11:11:12], cid=[no info], tid=0x6569ccdb: lease
> 192.168.225.10 has been allocated
> 
> 2018-10-08 15:05:35.332 DEBUG [kea-dhcp4.dhcp4/1602]
> DHCP4_CLIENTID_IGNORED_FOR_LEASES [hwtype=1 00:00:00:11:11:12], cid=[no
> info], tid=0x6569ccdb: not using client identifier for lease allocation
> for subnet 1
> 
> 2018-10-08 15:05:35.333 DEBUG [kea-dhcp4.callouts/1602]
> HOOKS_CALLOUTS_BEGIN begin all callouts for hook leases4_committed
> 
> 2018-10-08 15:05:35.333 DEBUG [kea-dhcp4.http/1602]
> HTTP_CLIENT_REQUEST_SEND sending HTTP request POST / HTTP/1.1 to
> http://192.168.62.5:8080/
> 
> 2018-10-08 15:05:35.333 DEBUG [kea-dhcp4.http/1602]
> HTTP_CLIENT_REQUEST_SEND_DETAILS detailed information about request sent
> to http://192.168.62.5:8080/:
> 
> POST / HTTP/1.1
> 
> Content-Length: 283
> 
> Content-Type: application/json
> 
>  
> 
> { "arguments": { "expire": 1539007935, "force-create": true, "fqdn-fwd":
> false, "fqdn-rev": false, "hostname": "", "hw-address":
> "00:00:00:11:11:12", "ip-address": "192.168.225.10", "state": 0,
> "subnet-id": 1, "valid-lft": 4000 }, "command": "lease4-update",
> "service":[ "dhcp4" ] }
> 
> 2018-10-08 15:05:35.333 DEBUG [kea-dhcp4.callouts/1602]
> HOOKS_CALLOUT_CALLED hooks library with index 2 has called a callout on
> hook leases4_committed that has address 0x7fc94ef83a10 (callout
> duration: 0.542 ms)
> 
> 2018-10-08 15:05:35.334 DEBUG [kea-dhcp4.callouts/1602]
> HOOKS_CALLOUTS_COMPLETE completed callouts for hook leases4_committed
> (total callouts duration: 0.542 ms)
> 
> 2018-10-08 15:05:35.334 DEBUG [kea-dhcp4.hooks/1602]
> DHCP4_HOOK_LEASES4_COMMITTED_PARK [hwtype=1 00:00:00:11:11:12], cid=[no
> info], tid=0x6569ccdb: packet is parked, because a callout set the next
> step to PARK
> 
> 2018-10-08 15:05:36.335 DEBUG [kea-dhcp4.http/1602]
> HTTP_SERVER_RESPONSE_RECEIVED received HTTP response from
> http://192.168.62.5:8080/
> 
> 2018-10-08 15:05:36.336 DEBUG [kea-dhcp4.http/1602]
> HTTP_SERVER_RESPONSE_RECEIVED_DETAILS detailed information about well
> formed response received from http://192.168.62.5:8080/:
> 
> HTTP/1.1 200 OK
> 
> Content-Length: 140
> 
> Content-Type: application/json
> 
> Date: Mon, 08 Oct 2018 13:05:35 GMT
> 
>  
> 
> [ { "result": 1, "text": "unable to forward command to the dhcp4
> service: No such file or directory. The server is likely to be offline" } ]
> 
> 
> 
> _______________________________________________
> Kea-users mailing list
> Kea-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/kea-users
> 



