DHCP Failover and DHCP relay question

Mon Mar 27 23:00:25 UTC 2006

On Mon, Mar 27, 2006 at 04:24:29PM -0500, Darren wrote:
> Mr. Hankins (or anyone who knows the answer to this question),

I don't think my dad subscribes to dhcp-users.  I guess I'll answer as
his proxy.

> v.3.0.3 of ISC DHCP
> 
> Let's say we have DHCP failover partners at 192.168.0.1 and 192.168.0.2 
> and a cable modem bootp server running on windows at 192.168.0.3 for 
> example.  The client DHCP of a cable modem network will be going to the 
> DHCP failover partners.  The cable modem bootp traffic will be going to 
> the windows bootp server.  It is not possible in this example to place 
> the bootp on the failover partners (actually, I don't think the ISC DHCP 
> server supports bootp on failover anyways).

That's correct.  In order to enable failover you must configure 'deny
dynamic bootp clients;' on the failover address pools, in all presently
released versions anyway.

But if the bootp client matches a host statement, or if there is
another (non-failover) dynamic pool with dynamic-bootp enabled, it
might still get answered.  So if you do want to move bootp into your
dhcpd pair, you can configure the same host statements on both failover
servers, or simply configure one pool on one server for bootp.
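
A minimal sketch of what that could look like in dhcpd.conf, assuming a
hypothetical relayed subnet 10.0.0.0/24, a made-up modem MAC address, and
a failover peer named "dhcp-failover" declared elsewhere in the file (none
of these names or addresses come from the setup described above):

```
subnet 10.0.0.0 netmask 255.255.255.0 {
  # Failover pool: dynamic bootp has to be denied here.
  pool {
    failover peer "dhcp-failover";
    deny dynamic bootp clients;
    range 10.0.0.20 10.0.0.199;
  }

  # Optional non-failover pool for dynamic bootp clients.  Configure
  # this on ONE server only, since the peers do not coordinate it.
  pool {
    range dynamic-bootp 10.0.0.200 10.0.0.220;
  }
}

# A bootp client that matches a host statement can be answered by
# either server, so the same statement goes on both.
host example-modem {
  hardware ethernet 00:11:22:33:44:55;
  fixed-address 10.0.0.10;
}
```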

I think 'ignore bootp;' might help avoid irregular responses.

> We have an old USR cable access router that will be acting as the DHCP 
> relay agent for both cable modems and the clients behind them.  While 
> this router does support failover, it does not support separation of the 
> client/cable modem traffic.  Therefore, to make this work we would set 
> the primary DHCP server on the cable access router to 192.168.0.3 so 
> that the cable modem bootp traffic goes to the windows server.  We will 
> set the secondary DHCP server to 192.168.0.1 so that the client DHCP 
> goes to the first failover server.
> 
> Can anyone tell me if this will work?

No.  But I can tell you: you're screwed.

> Specifically, what will happen if 
> the hash check on the client MAC Address says that the secondary DHCP 
> server should answer, but the secondary server never received the DHCP 
> Discover message from the client?  Will the primary client answer?  If 
> so, what parameters will determine if it answers (i.e., what should these 
> options be set to):
> 
>         max-response-delay 5;
>         max-unacked-updates 5;
>         mclt 600;
>         split 128;
>         load balance max seconds 5;
> 
> This failover cluster will also be used by other DHCP Relay agents that 
> do indeed support failover properly, and many of which are not cable 
> modem networks at all.

OK.  If LBA hashes to the secondary, the primary will not answer unless
the client is configured in a fixed-address host {} statement, or the
lease is ACTIVE, or the 'secs' field of the DHCP packet received exceeds
your configured value of 5 ("load balance max seconds"), or the peer
state is not normal (vast oversimplification).
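
For reference, a sketch of where those parameters live: a failover peer
declaration as it might appear on the primary at 192.168.0.1.  The peer
name and port numbers are assumptions; the values are the ones quoted
above, and the comments are a rough summary rather than a full
description of each statement.

```
failover peer "dhcp-failover" {
  primary;
  address 192.168.0.1;
  port 519;                      # assumed port
  peer address 192.168.0.2;
  peer port 520;                 # assumed port

  max-response-delay 5;          # seconds of silence before the peer
                                 # connection is considered failed
  max-unacked-updates 5;         # binding updates the peer may send
                                 # before it must wait for an ack
  mclt 600;                      # maximum client lead time (primary only)
  split 128;                     # LBA hash split (primary only);
                                 # 128 divides clients roughly evenly
  load balance max seconds 5;    # once a client's 'secs' exceeds this,
                                 # the LBA hash no longer stops a
                                 # server from answering
}
```

Of these, only "load balance max seconds" bears on whether the primary
answers a client that hashes to the secondary.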

None of those are things you really want to rely on.  And when the
primary goes down, the secondary will never get the packets from the
relay to even hope to process them.  So it's of no use.

But I don't really see what you can do if you can only configure two
dhcp servers on the USR device.

Some folks on this list have played with 'anycast DHCP'.  It's a
functional design, but sadly one we don't support.  Basically with this
kind of topology, you really want whatever server receives the client's
packet to answer...because only one ever will.  You configure a 'role'
address which all servers have configured as lo0 aliases, and use
routing protocols to make sure the nearest live dhcp server gets the
client's packets (with some flow-cache magic to keep forwarding
consistent on source:dest ip:port hashes).

Overall it's the same kind of 'technology' used for the F root name
server, documented in detail by one Joe Abley:

	http://www.isc.org/index.pl?/pubs/tn/index.pl?tn=isc-tn-2004-1.txt
	http://www.isc.org/index.pl?/pubs/tn/index.pl?tn=isc-tn-2003-1.txt

You can use failover to make sure there aren't duplicate assignments
(but then you have to code-out LBA), or you can just use dhcp servers
that each have a different dynamic range configured (and then use more
than failover's limit of 2 servers).
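
The second option, sketched as two separate dhcpd.conf fragments with no
failover at all.  The subnet and range boundaries are hypothetical; the
only point is that the dynamic ranges do not overlap, so whichever server
receives a packet can answer from its own range.

```
# --- dhcpd.conf fragment on server A (hypothetical addresses) ---
subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.20 10.0.0.127;
}

# --- dhcpd.conf fragment on server B: same subnet, disjoint range ---
subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.128 10.0.0.250;
}
```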

That's not a well-beaten path for DHCP.  For one, it will take
source changes in ISC DHCP to support failover in it (which is important
if you can't throw infinite supplies of DHCP addresses at the problem).
For another, whereas the current failover implementation is fairly
laissez-faire with returning IP addresses to clients returning to the
network, without LBA in this anycast method you're basically accepting
total randomization (whether you use failover or not).  With failover
and no anycast, at least you have a hope things might get better in a
future release ("mac address affinity"), but there's just no way to
go that route with anycast.

See if you can get the USR to accept a protocol address that's a local
broadcast (e.g. 192.168.0.255) on the dhcp servers' wire.  It's a hack,
but it might work.

-- 
David W. Hankins		"If you don't do it right the first time,
Software Engineer			you'll just have to do it again."
Internet Systems Consortium, Inc.		-- Jack T. Hankins