[Kea-users] kea-2.2.0 - HA cluster - communication between stork and dhcp4 gets lost
Stefan G. Weichinger
lists at xunil.at
Thu Jun 22 05:55:00 UTC 2023
We see this in the stork events every few days:
2023-06-20 20:01:23 daemon [2] dhcp4 is unreachable
2023-06-20 20:01:07 Communication with daemon [2] dhcp4 of app kea at 10.0.0.231 failed
After restarting both dhcp4 and stork-agent on that adc1 server, things
work again.
I still have to check the logs in more detail, of course.
Two things:
1) We collect the Prometheus metrics from Stork and visualize them in
Grafana.
storkserver_auth_unreachable_machine_total{instance=~"$instance"}
is always 0, even when the events above show up. I would expect one of
the two machines to be counted as unreachable at that point. Right?
2) It doesn't solve the problem at the root, but I am considering
setting up some external monitoring to detect this outage and let the
monitoring restart the daemons.
I use monit (https://mmonit.com/wiki/Monit/ConfigurationExamples) for
such things, and am thinking of letting it make HTTP API calls to
isc-kea to check the daemon.
Right approach?
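As a rough sketch of what such a check could look like: the script
below sends a status-get command for the dhcp4 service to the Kea
Control Agent and treats anything other than result 0 as unhealthy.
The URL is an assumption (the address from the events above plus port
8000, which has to match the local Control Agent config), and the
function names are made up for illustration.

```python
"""Sketch: health check for Kea dhcp4 through the Kea Control Agent."""
import json
import urllib.request

# Assumption: the Control Agent listens on this address and port.
CA_URL = "http://10.0.0.231:8000/"


def build_request(service="dhcp4"):
    """Return the JSON body for a status-get command forwarded to `service`."""
    return json.dumps({"command": "status-get", "service": [service]}).encode()


def is_healthy(response):
    """True if every entry in the decoded response reports result 0."""
    # The Control Agent answers with a list, one entry per addressed service.
    if not isinstance(response, list) or not response:
        return False
    return all(isinstance(r, dict) and r.get("result") == 0 for r in response)


def check(url=CA_URL, timeout=5):
    """Query the Control Agent; network or parse errors count as unhealthy."""
    req = urllib.request.Request(
        url, data=build_request(), headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return is_healthy(json.load(resp))
    except (OSError, ValueError):
        return False


def main():
    """Return a process exit status: 0 = healthy, 1 = unhealthy."""
    return 0 if check() else 1
```

Saved as a script that ends with sys.exit(main()), this could be run
from a monit "check program" entry that restarts the daemons when the
exit status is non-zero. Note that it only tests whether dhcp4 answers
through the Control Agent, not whether stork-agent itself is reachable.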
thanks, Stefan