Help with unresolvable domain (subdomain, actually)
Warren Kumari
warren at kumari.net
Wed Mar 2 18:49:09 UTC 2011
On Mar 2, 2011, at 1:20 PM, Kevin Darcy wrote:
> On 3/2/2011 10:34 AM, David Sparro wrote:
>>
>>
>> On 3/1/2011 5:27 PM, Kevin Darcy wrote:
>>> See my other post. This is designed-in behavior for Cisco GSSes, since
>>> there is no "service unavailable, try again later" RCODE.
>>>
>>
>> When the question is "what is the ip address of 'foo'" an answer of
>> "the web server is down" is nonsensical.
>>
> Hmmm... matter of perspective I suppose. Load-balancer architecture
> sees DNS as just the externally-visible portion of a whole
> subsystem. The SERVFAIL, in their view, does not communicate a DNS
> problem _per_se_, but a problem with the whole subsystem. It's more
> of a "what you're trying to get to is unavailable right now"
> message, communicated, in their view, _through_ DNS (as a sort of
> conduit), not necessarily _about_ DNS. They don't see it as
> specifically meaning "I've got a DNS problem".
But, everyone else *will*.
>
> I'm not saying I agree with this perspective, only that I've dealt
> with load-balancer vendors enough (Cisco in particular) to
> understand that this is where they're coming from.
>
> Besides, what alternative is there? If the load-balancer returns an
> address that it knows to not be working, then it's purposely causing
> the client to go into a relatively-slow connection-timeout failure
> mode. Is that responsible behavior? If it gives a "normal" response
> that is lacking answer information (NODATA, NXDOMAIN), then this
> response gets negatively cached, and the negative cache entry may
> delay clients from re-trying the resource even after it recovers.
> So, what's left? NOTIMP? FORMERR? REFUSED? NOTAUTH? Those aren't any
> better than SERVFAIL from a strictly functional perspective, and are
> even more misleading and confusing with respect to the real source
> of the problem.
A few options:
1: once the LB knows that all back-ends are down, it can continue to
answer with the correct A, but drop the TTL to be much shorter -- this
allows things to recover faster.
2: have the LB itself serve a 'sorry' page -- the ability to serve
static content locally should be simple, but if it is not able to do so
it can always return a set of 'sorry' servers optimized for this
purpose.
You shouldn't be breaking both your serving *and* 'sorry' backends
often enough to need special handling (and, if you are, you shouldn't
make things worse by making other folk waste their time debugging
your problem).
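
A rough sketch of how the two options above could fit together in the
LB's answer logic. This is illustrative Python, not how a GSS (or any
other product) actually works: choose_answer, Answer, the TTL values
and the documentation-range addresses are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Assumed values for illustration only; real deployments would tune these.
NORMAL_TTL = 300    # seconds, used while at least one back-end is healthy
DEGRADED_TTL = 5    # seconds, used while every back-end is down


@dataclass(frozen=True)
class Answer:
    addresses: List[str]
    ttl: int


def choose_answer(health: Dict[str, bool],
                  sorry_servers: Sequence[str] = ()) -> Answer:
    """Pick the A records and TTL to return (hypothetical helper).

    health maps each back-end address to True (up) or False (down).
    """
    healthy = [addr for addr, up in health.items() if up]
    if healthy:
        # Normal case: hand out only the working back-ends.
        return Answer(healthy, NORMAL_TTL)
    if sorry_servers:
        # Option 2: everything is down, so point clients at 'sorry' servers.
        return Answer(list(sorry_servers), DEGRADED_TTL)
    # Option 1: keep answering with the configured addresses, but with a
    # short TTL so resolvers come back quickly once the service recovers.
    return Answer(list(health), DEGRADED_TTL)


if __name__ == "__main__":
    all_down = {"192.0.2.10": False, "192.0.2.11": False}
    print(choose_answer(all_down))
    print(choose_answer(all_down, sorry_servers=["198.51.100.5"]))
```

In every branch the client gets a normal answer with addresses in it,
so nothing is negatively cached and nobody sees a SERVFAIL.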
W
>
> - Kevin
>
>
> _______________________________________________
> bind-users mailing list
> bind-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users
>
--
I had no shoes and wept. Then I met a man who had no feet. So I
said, "Hey man, got any shoes you're not using?"