[Kea-users] [EXTERNAL] HA use-cases for large enterprises?

Geist, Dan (CCI-Atlanta) Dan.Geist at cox.com
Tue Aug 22 19:47:02 UTC 2023


Thanks, Darren. It sounds like there’s some potential for the idea with caveats:

> On Aug 22, 2023, at 3:29 PM, Darren Ankney <darren.ankney at gmail.com> wrote:
> 
> Hi Dan,
> 
>> After looking through the various HA options presented in the admin manual, a question arises around scaling and HA setup options: What happens to the model when a common db lease backend (MySQL or PostgreSQL) is used?
> 
> If you mean both using the same database (and, therefore, the same
> lease4 or lease6 tables), then you would want to disable
> "send-lease-updates" and "sync-leases" (see:
> https://kea.readthedocs.io/en/kea-2.4.0/arm/hooks.html#lease-information-sharing)
Yes, that’s what I would intend to do.
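
For reference, a minimal sketch of the HA hook entry with both flags turned off, as it might look when both servers share one lease database. The server names, peer URLs, library path, and choice of "hot-standby" mode are placeholders I am assuming for illustration, not values from our deployment:

```python
import json

# Assumed placeholder values: library path, peer names, and URLs are hypothetical.
def ha_shared_db_hook(this_server: str) -> dict:
    """Build an HA hook entry for servers that share one lease database."""
    return {
        "library": "/usr/lib/kea/hooks/libdhcp_ha.so",
        "parameters": {
            "high-availability": [{
                "this-server-name": this_server,
                "mode": "hot-standby",
                # Both servers read and write the same lease tables, so
                # there is nothing to push to, or pull from, the partner.
                "send-lease-updates": False,
                "sync-leases": False,
                "peers": [
                    {"name": "server1", "url": "http://192.0.2.1:8000/", "role": "primary"},
                    {"name": "server2", "url": "http://192.0.2.2:8000/", "role": "standby"},
                ],
            }]
        },
    }

hook = ha_shared_db_hook("server1")
print(json.dumps(hook, indent=2))
```

The printed JSON would go into the "hooks-libraries" list of each server, with only "this-server-name" differing between them.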
> 
>> If most of the transactional latency in using the HA models is due to lease updates being propagated to backups, how does that change when a common backup is in place?
> 
> If you are referring to Kea HA and not database HA, and assuming a
> common database with lease syncing therefore disabled, there would be
> no propagation in that sense.  There is a somewhat substantial
> performance penalty to using MySQL/MariaDB or PostgreSQL for lease
> storage, however (see here:
> https://kea.readthedocs.io/en/kea-2.4.0/arm/dhcp4-srv.html#multi-threading-settings-with-different-database-backends
> and here: https://reports.kea.isc.org/performance/stable/2.4.0/report.html).
> Please note that MySQL 8 does not perform well with Kea (see here:
> https://kea.readthedocs.io/en/kea-2.4.0/arm/admin.html#mysql-5-7-vs-mysql-8-vs-mariadb-10-and-11).
PostgreSQL it is. We’ll see if it’s fast enough.

> 
>> Further, utilizing one of the new features of 2.4 ( **Early allocation**: RFC2131 ) to help with possible lease collisions, would it be possible to create an n-node (horizontally scaled) cluster of servers without a native HA scheme but WITH a highly performant lease and configuration backend? We would likely run ECMP and Anycast on the cluster nodes for the listening IP(s) such that for a given client/server transaction, there would be only one service node communicating with that client. In this way, the drawbacks of HA blocking transactions are avoided by making the service nodes completely unaware of each other and relying on the DHCP service behavior and current lease data to avoid conflicts.
> 
> I'm not sure which new feature you are referring to.  I did not find
> "Early allocation" mentioned in RFC2131.  Perhaps you were referring
> to Temporary Allocation on Discover
> https://kea.readthedocs.io/en/kea-2.4.0/arm/dhcp4-srv.html#temporary-allocation-on-dhcpdiscover
> This can make it easier to use a shared database with the servers as
> it avoids a race condition where two clients might be offered the same
> address but only one of them will actually be able to finish the
> process while the other will have to discover again.
Early allocation is the one I’m thinking of, yes. We’re okay with two requests being sent and only one completing. This is what happens today (with primary and secondary boot relay targets to an active/standby pair).
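
To make the idea concrete, here is a rough sketch of the per-node configuration I have in mind: every node gets the same Dhcp4 config pointing at the shared PostgreSQL lease backend, with "offer-lifetime" set so that an address is held briefly when it is offered. The database host, credentials, subnet, pool, and the 60-second value are all made-up placeholders:

```python
import json

# Assumed placeholder values: host, credentials, subnet, and pool are hypothetical.
def node_config(db_host: str, offer_lifetime: int = 60) -> dict:
    """Build a Dhcp4 config for one stateless node behind ECMP/anycast."""
    return {
        "Dhcp4": {
            # Shared lease backend; the nodes know nothing about each other.
            "lease-database": {
                "type": "postgresql",
                "name": "kea",
                "host": db_host,
                "user": "kea",
                "password": "changeme",
            },
            # Temporary allocation on DHCPDISCOVER: the offered address is
            # recorded in the shared database for this many seconds.
            "offer-lifetime": offer_lifetime,
            "subnet4": [{
                "id": 1,
                "subnet": "192.0.2.0/24",
                "pools": [{"pool": "192.0.2.10 - 192.0.2.200"}],
            }],
        }
    }

# Every node in the cluster receives an identical configuration.
nodes = ["node-a", "node-b", "node-c"]
configs = {name: node_config("db.example.net") for name in nodes}
print(json.dumps(configs["node-a"], indent=2))
```

Since the nodes are interchangeable, adding capacity would just mean deploying the same file to another host and adding it to the anycast group.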
> 
>> The eventual intent with this setup would be to also include bi-directional and non-blocking db replication to one or more “disaster sites” which would be kept (almost) in sync with the data in POPs but would only ever receive client traffic in the case of catastrophic site failure; absolute sync being not as important as the simple ability to serve the subnets in question.
> 
> You might be interested in using the passive backup role (see:
> https://kea.readthedocs.io/en/kea-2.4.0/arm/hooks.html#passive-backup-configuration),
> where more Kea servers can receive lease updates and store them
> using a database or memfile.  In the event of catastrophic failure, this
> passive backup could be turned into an active server by changing
> "this-server-name" in the configuration.  You could have several of
> these backups, I believe.  Failing back would be a small challenge,
> however, as the leases would need to be synced in some way.
I looked at the “backup” role and came to the same conclusion. It could be made less I/O-bound by not having services run, but there would still be a data integrity issue on fail-back (as well as a non-automatic failover).
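
For anyone following along, a sketch of what the passive-backup variant might look like, assuming one primary in the POP and one backup at the disaster site. The peer names, URLs, and library path are hypothetical; the point is that the two servers' HA sections differ only in "this-server-name", which is the setting Darren mentions changing by hand during a failover:

```python
import json

# Assumed placeholder values: peer names, URLs, and library path are hypothetical.
def passive_backup_hook(this_server: str) -> dict:
    """Build a passive-backup HA hook entry for the named server."""
    return {
        "library": "/usr/lib/kea/hooks/libdhcp_ha.so",
        "parameters": {
            "high-availability": [{
                "this-server-name": this_server,
                "mode": "passive-backup",
                # Do not hold client responses while waiting for the backup.
                "wait-backup-ack": False,
                "peers": [
                    {"name": "pop1", "url": "http://192.0.2.1:8000/", "role": "primary"},
                    {"name": "dr1", "url": "http://198.51.100.1:8000/", "role": "backup"},
                ],
            }]
        },
    }

primary = passive_backup_hook("pop1")
backup = passive_backup_hook("dr1")
print(json.dumps(backup, indent=2))
```

Nothing here addresses the fail-back problem: leases handed out by the disaster site would still have to be reconciled with the POP by some other means.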

Appreciate the comments. We’ll keep working on the concept and advise if anything interesting is found.

Dan


