Esoteric question

glenn.satchell at uniq.com.au
Tue Sep 17 14:29:54 UTC 2019


Hi Greg,

A very interesting problem... I've heard good reports about both of those
vendors' hardware, so that sounds like a reasonable choice.

What do you get if you snoop eth1 while connected to the different WAN 
devices? I wonder if dhcpd is trying to talk to something else upstream 
(no idea why it would do that).
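
If the eth1 capture can be exported as rows of time, source, destination
and destination port, something like the rough Python sketch below could
show what the router itself was sending upstream during the stall between
REQUEST and ACK. It is only an illustration: the Packet layout, the
upstream_talkers name and the sample rows (documentation addresses in
192.0.2.0/24) are assumptions, not anything taken from a real capture.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """One row exported from a capture on eth1 (hypothetical layout)."""
    time: float   # seconds since the start of the capture
    src: str      # source IP address
    dst: str      # destination IP address
    dport: int    # destination port (0 if not TCP/UDP)


def upstream_talkers(packets: Iterable[Packet], router_ip: str,
                     start: float, end: float) -> Counter:
    """Count (dst, dport) pairs the router sent to within [start, end]."""
    hits = Counter()
    for p in packets:
        if p.src == router_ip and start <= p.time <= end:
            hits[(p.dst, p.dport)] += 1
    return hits


# Illustrative rows only, NOT real capture data.
sample = [
    Packet(0.5, "1.2.3.5", "192.0.2.53", 53),
    Packet(5.5, "1.2.3.5", "192.0.2.53", 53),
    Packet(6.0, "192.0.2.80", "1.2.3.5", 443),   # inbound, ignored
    Packet(200.0, "1.2.3.5", "192.0.2.53", 53),  # outside the window
]

# Window chosen to cover the roughly 90s pause described in the report.
talkers = upstream_talkers(sample, "1.2.3.5", 0.0, 95.0)
for (dst, dport), count in talkers.most_common():
    print(f"{dst}:{dport} x{count}")
```

Because of the masquerading, LAN client traffic will also leave eth1 with
the router's WAN address as source, so this is most useful while only the
one test client is plugged in.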

Does the Ubiquiti have some form of cloud management or call home setup?

Best of luck.

regards,
-glenn

On 2019-09-17 09:20, Gregory Sloop wrote:
> So, this is kind of a wild-goose chase for some direction - but I
> thought there might be some useful answers here.
> 
> [But I know it's way out there and I'm not going to get direct help on
> solving the issue on the platform I'm having issues with - just bear
> with me and see if you have any helpful ideas.]
> 
> Let me set the background.
> 
> I'm using specific device hardware - in this case, a Mikrotik RB450G
> [currently in place] and moving to a Ubiquiti EdgeRouter Lite.
> They're multi-ethernet interface routers - based on Linux.
> The RB450G works fine and simply needs replacement. [The two devices
> are configured as identically as I can. They're very different, so
> we're talking "functionally" identical, not literally with the same
> conf files.]
> 
> I'm having issues with DHCPd on the new device. [And queries at
> Ubiquiti are going nowhere fast. It IS an unusual problem, so I'm not
> terribly surprised.]
> 
> Let's assume Eth0/LAN is 10.0.0.1/24
> DHCPD is set up to hand out addresses for 10.0.0.20-100, say.
> 14440 second leases.
> Clients are connected directly to a switch that's directly connected
> to Eth0. [No DHCP relay etc.]
> 
> Eth1/WAN is a static /30 - connected directly to a Comcast Modem/BSG.
> Let's say 1.2.3.5/30
> The gateway [not that it matters] is 1.2.3.6
> 
> We're masquerading traffic [NAT] from the local RFC1918 [10.0.0.0/24]
> network to the static public IP on the WAN.
> 
> ---
> So, here's what happens/happened.
> 
> I went in to swap out the 'Tik box for the new hardware.
> Plug it in, and none of the clients on the LAN get DHCP addresses. All
> the DHCP clients time out.
> After several passes at testing here's what I find.
> 
> I can't find any configuration problems on the replacement hardware.
> The *old* 'Tik hardware/software works perfectly.
> 
> If we have the WAN connected to a simple live ethernet port on the
> *new hardware,* [EdgeRouter] DHCP works fine for the LAN side. Totally
> fine.
> Only when we plug in the Comcast gateway/modem into the WAN port on
> the new hardware does DHCP fail/timeout. [Remember just plugging it
> into a regular ethernet switch works fine. It won't pass traffic,
> because the static IP assignment isn't right - but the LAN side DHCP
> server works perfectly.]
> 
> If we take a client on the LAN and plug in a static IP [rather than
> DHCP], traffic flows out to the internet perfectly fine.
> 
> Packet caps from the new router show that the router/DHCP server IS
> seeing all the DHCP protocol handshake. [When it's having the
> "problem."]
> The client does a DISCOVER
> Server responds with OFFER
> The client responds with REQUEST
> Then there's a LONG pause. [like 90s+ worth.]
> The Server responds with ACK. [It actually appears to send several
> ACKs. I probably cut my captures too short, so I only have about 2m of
> capture in my largest one. But that's what I see in what I have.]
> However, the client [Windows in this case] has timed out, and never
> gets the ACK.
> And while I'm not 100% certain, the times I've looked, the device
> believes it's handed out a lease. [I believe it's in the leases file.]
> But because of the long delay, the client never actually got the
> lease.
> 
> Again,
> -Simply unplugging the Comcast modem from the router makes DHCP
> immediately start working again.
> -Plugging Eth1 into a live ethernet port [so that interface is seen as
> up] also works fine.
> -It's only when connected to the Comcast gateway/modem that it fails.
> 
> On the LAN side of the network, we've tinkered with replacing the
> switches - dumb switches, identically configured managed switches, a
> different managed switch, or no switch at all, with the router simply
> plugged directly into a single client. No changes on the LAN side make
> the slightest difference either.
> 
> Since we're doing NAT/MASQ from LAN->WAN no WAN traffic should leak
> into the LAN - but I've also explicitly defined rules that prevent
> anything from the WAN getting to the LOCAL or LAN interfaces - other
> than established/related traffic.
> 
> So, I'm not asking for you to solve the issue on this particular
> hardware. What I'm asking for is some plausible explanation that might
> produce these symptoms. I'm completely at my wits' end. I've spent a
> lot of hours trying a whole host of troubleshooting things - but I
> can't think of any possible way this could be happening. But clearly
> it is.
> 
> IMO, either we have some very weird hardware physical layer problem
> that only impacts DHCP [and not traffic routing] or there's something
> I'm missing. I'd normally imagine that I'm missing something - but
> can't figure out what, if anything.
> 
> I've tried to closely define the setup, but I'm sure I've forgotten
> something - perhaps lots of somethings - just ask and I'll try to
> clarify any missing pieces.
> 
> Given how awesome people on this list are, I'm hopeful someone will
> have something that might jiggle loose something useful!
> 
> TIA
> -Greg