[Kea-users] Kea schema meta-issues (was Host reservations in PostgreSQL)

Tomek Mrugalski tomasz at isc.org
Tue Jan 12 21:36:22 UTC 2016


On 04.01.2016 14:28, Shane Kerr wrote:
> I spent a short time looking at the schema for the PostgreSQL and I
> have a few questions before I go further.
> 
> ----
> 
> First, a question for both MySQL and PostgreSQL. There are certain
> values which are basically enumerated data types: 
> 
> - lease6_types
> - lease_hwaddr_source
> - lease_state
> 
> I guess it was an intentional choice not to use enum types here, but
> both MySQL and PostgreSQL support enum types, so I'm not sure of the
> motivation. I guess the reason is that the Kea C++ code uses constants
> defined elsewhere to match these (or perhaps these come from the RFC
> documents)? It's not a big deal, but in the interests of consistency I
> would generally put a foreign key constraint there.
Good point. Those tables are not used explicitly by the Kea code. We
just thought it would be useful to have them for making queries more
human readable.

I have no idea why they're not enums. I presume you can assign specific,
explicit values to enums, right? We do have those constants defined in
.h files and I'd like to keep them explicitly defined.

> Second, there is support in the MySQL schema that is missing in the
> PostgreSQL schema beyond the host reservations. For example, the MySQL
> schema includes information about hardware/MAC address in the lease6
> database. This was added to the MySQL schema in 2.0 of the schema.
Yup. I think it was never implemented in PostgreSQL. Our excuse is the
usual one - lack of time. Actually, there is more to that. We miss
hardware address in PostgeSQL, but also missing hwaddress source (in
both mysql and pgsql). Getting MAC address in DHCPv6 is a tricky
business, so depending on where the address was extracted from it may be
less or more trustworthy.

Also, there's a pull request that attempts to address some of the issue
in pgsql. Sadly, it's incomplete and abandoned:

https://github.com/isc-projects/kea/pull/9

> Does it make sense to try to get the PostgreSQL schema revised to
> include all of the missing bits from 2.0 of the schema before moving on
> to 3.0?
Yes, definitely.

> Speaking of schema versions... I assume that the schemas use something
> like semantic versioning? So 3.0 to 3.1 is a compatible change that
> adds new features but 3.1 to 4.0 is incompatible?
You guessed right. But you didn't have to. It's actually documented:
http://git.kea.isc.org/~tester/kea/guide/kea-guide.html#kea-database-version

We do have a backend version as well. This can be checked with using -V
on command-line. The original idea was to do a check whether a database
is correct and refuse to run if inconsistency between the code and the
database versions are detected. This is partially implemented and
currently not working as far as I can tell.

> What is the idea when it comes to supporting multiple databases? I ask
> because I assume that schemas cannot go backwards, so presumably the
> changes to add hardware/MAC address to lease6 would change the
> PostgreSQL from 3.0 to 4.0, even though there would be no change at all
> for the MySQL schema.
They're independent and don't have to match each other.

> I guess for simplicity it makes sense to have a single schema version,
> but it also means that administrators may be scared of updates because
> a 3.0 to 4.0 seems more serious than a 3.0 to 3.1... even if it does
> not affect their install at all.
Nope, we keep them separate for each backend. No need to scare
PostgreSQL folks when we're messing with MySQL :)

> Speaking of updating schemas... :)
> 
> In PostgreSQL it is possible to use support for the poorly-named
> SCHEMA abstraction along with triggers to perform transparent, in-place
> modification of schemas.
> 
> Basically you can do something like:
> 
> 1. start with a schema named something like kea_db_3.0 (default for the
>    kea user)
> 2. when a new version is created, make kea_db_4.0 along with a set of
>    triggers to keep data in sync between the two versions
> 3. migrate the data to the new version
> 4. change the default of the kea user to go to kea_db_4.0
> 5. sometime when everything has been working for a while drop the
>    kea_db_3.0 schema
> 
> http://www.postgresql.org/docs/9.4/static/ddl-schemas.html
No way. This is definitely too complicated to be robust. The only real
advantage of this approach is if you really, really want your downtime
to be minimal, so you're essentially running both old and new version at
the same time.

> I don't know if there is any support for such things in MySQL
> databases, but there are a lot of advantages to this approach, even
> though it involves more developer work.
Yup. And testing it would be a major task on its own. Our engineering
resources are limited, so I think they're better spent on making sure
that the upgrade procedure is working correctly vs. making it possible
to migrate back.

On a related note, it seems that so far all the changes we introduced
were always adding new columns or tables. If there's a case where user
determines that he absolutely must revert back to the old version, it
should be possible to drop the new columns and manually revert schema
version. This is purely a speculation, I don't recommend doing such a thing.

> It won't be top of my list, but this may be something I want to explore
> as well.
It's certainly an interesting concept. I suppose it could have benefits
for the case I mentioned - when you want to minimize upgrade downtime to
sub-seconds level, this could be a viable way to do it. But the
engineering overhead for this would be substantial, especially in the
testing area, so this is not something I'm eager to do.

Oh, one final thought before we start changing the schemas. One of the
important aspects here should be performance. I'm sure there are better,
safer ways of doing things with tons of triggers, constraints, foreign
keys etc., but we should understand what's their impact on performance
is. Fortunately, we do have perfdhcp, which can measure backend
performance reliably and repetitively, so it's easy to assess the
impact. Couple leases less per second is a fair price to pay for better
self-consistency, but if the performance goes down by a lot, we may have
second thoughts whether to go ahead with a change or not.

Thanks a lot for running Kea and for your willingness to improve it!

Tomek




More information about the Kea-users mailing list