[Kea-users] kea-2.2.0 - HA cluster - communication between stork and dhcp4 gets lost

Stefan G. Weichinger lists at xunil.at
Thu Jun 22 05:55:00 UTC 2023


We see the following in the Stork events every few days:

2023-06-20 20:01:23  daemon [2] dhcp4 is unreachable

2023-06-20 20:01:07  Communication with daemon [2] dhcp4 of app kea at 10.0.0.231 failed

After restarting both dhcp4 and stork-agent on that server (adc1), 
things work again.

I still have to check the logs in more detail, of course.

Two things:

1) We collect the Prometheus metrics from Stork and visualize them in 
Grafana. The metric

storkserver_auth_unreachable_machine_total{instance=~"$instance"}

is always 0, even while the events above are being reported. I would 
expect one of the two machines to be marked unreachable. Right?

2) It does not solve the problem at the root, but I am considering 
setting up external monitoring that detects this outage and restarts 
the daemons.

I use monit (https://mmonit.com/wiki/Monit/ConfigurationExamples) for 
such things and am thinking of letting it make HTTP API calls to Kea 
to check whether the dhcp4 daemon still responds.

Right approach?
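
To make the idea concrete, here is a minimal sketch of the kind of 
check I have in mind. It assumes the Kea Control Agent is reachable 
over plain HTTP on its default port 8000 at the address from the event 
above; the URL, the function names and the choice of "version-get" as 
the probe command are my assumptions, not something taken from our 
setup.

```python
import http.client
import json
import urllib.request

# Assumption: Control Agent on the default port 8000, no TLS, no auth.
CTRL_AGENT_URL = "http://10.0.0.231:8000/"


def dhcp4_is_healthy(answer):
    """Return True if every entry of the decoded answer has result 0."""
    # The Control Agent answers with a JSON list, one entry per service.
    if not isinstance(answer, list) or not answer:
        return False
    return all(isinstance(e, dict) and e.get("result") == 0 for e in answer)


def query_dhcp4(url=CTRL_AGENT_URL, timeout=5.0):
    """Forward a harmless command to dhcp4 and return the decoded answer."""
    payload = json.dumps(
        {"command": "version-get", "service": ["dhcp4"]}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Return 0 if dhcp4 answered successfully, 1 otherwise."""
    try:
        healthy = dhcp4_is_healthy(query_dhcp4())
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout or an unparsable body count as down.
        healthy = False
    return 0 if healthy else 1
```

A wrapper script would end with sys.exit(main()), so that monit's 
"check program" test can react to a non-zero exit status and restart 
the daemons.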

thanks, Stefan

