intermittent SERVFAIL for high visible domains such as *.google.com
Brian J. Murrell
brian at interlinx.bc.ca
Thu Jan 18 15:59:58 UTC 2018
On Thu, 2018-01-18 at 15:41 +0000, Tony Finch wrote:
>
> Does the time to recovery correspond to the lame-ttl setting?
I am not sure. I'm not always aware of when it starts. I guess if I
ran with a trace level set permanently, the log would tell me though.
> The default
> is 10 minutes - try reducing it and see if the outage becomes
> shorter.
If it does, what is that telling me? That the problem domains are
listing NSes that don't actually host the zone? I thought named normally
logged lame delegations, but I don't see a single one in the last few
days.
That said, if such a high-visibility domain as Google's were
misconfigured, it would be wreaking havoc all over the Internet and
drawing lots of attention, wouldn't it?
> When you have a failure, try `rndc flushtree` to more selectively drop
> problematic state - you might have to find out the nameservers of the
> broken domain and flush them. (The google.com nameservers are under
> google.com; GitHub's are under dynect.net and a bunch of awsdns
> domains.)
rndc flushtree takes a domain name though, doesn't it? In what case
would I need to find nameservers?
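
If I read the suggestion right, flushing the broken domain's tree would not
touch nameservers whose names sit under some other domain, so those might
need flushing by name as well. A minimal sketch of that reading, which only
builds the command lines and does not run them. The function name and the
example.com / example.net names are hypothetical; `rndc flushtree` and
`rndc flushname` are the only rndc commands assumed.

```python
def flush_commands(domain, nameservers):
    """Build rndc argument lists for one broken domain.

    The domain itself gets a flushtree; every nameserver whose name is
    NOT inside that tree gets its own flushname, since the flushtree
    would not reach it.
    """
    domain = domain.rstrip(".").lower()
    commands = [["rndc", "flushtree", domain]]
    for ns in nameservers:
        ns = ns.rstrip(".").lower()
        # A name is inside the tree if it equals the domain or ends in ".<domain>".
        in_tree = ns == domain or ns.endswith("." + domain)
        if not in_tree:
            commands.append(["rndc", "flushname", ns])
    return commands


if __name__ == "__main__":
    # Hypothetical names: one in-zone nameserver, one out-of-zone.
    for argv in flush_commands("example.com", ["ns1.example.com", "ns1.example.net"]):
        print(" ".join(argv))
```

For a domain like google.com, whose nameservers are under google.com itself,
this would produce just the one flushtree.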
So, when I do rndc reload, am I flushing the cache? :-(
> Look at the end of the dump - the address database,
; Address database dump
...
; ns3.google.com [v4 TTL 7] [v6 TTL 7] [v4 failure] [v6 failure]
; ns2.google.com [v4 TTL 7] [v6 TTL 7] [v4 failure] [v6 failure]
; ns1.google.com [v4 TTL 7] [v6 TTL 7] [v4 failure] [v6 failure]
; ns4.google.com [v4 TTL 7] [v6 TTL 7] [v4 failure] [v6 failure]
> bad cache,
Empty.
> and
> servfail cache.
Non-existent section in my database dump.
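
For next time, picking the failed entries out of the address database
section of a dump could be scripted. A rough sketch, assuming each entry is
a single line shaped like the ones quoted above (`; name [flag] [flag] ...`)
and that the section starts at the "; Address database dump" line. The
function name is made up, and other BIND versions may lay the dump out
differently.

```python
import re

# One address database entry: "; <name> [..] [..] ...", as in the excerpt above.
ENTRY = re.compile(r"^;\s+(?P<name>\S+)\s+(?P<flags>\[.*\])\s*$")


def failed_servers(dump_text):
    """Return the names in the address database section that carry a
    "[v4 failure]" or "[v6 failure]" flag."""
    in_adb = False
    failed = []
    for line in dump_text.splitlines():
        if line.startswith("; Address database dump"):
            in_adb = True
            continue
        if not in_adb:
            continue
        match = ENTRY.match(line)
        if match and "failure]" in match.group("flags"):
            failed.append(match.group("name"))
    return failed


if __name__ == "__main__":
    import sys

    # Pass the path of the dump file written by `rndc dumpdb` as the argument.
    with open(sys.argv[1]) as dump:
        for name in failed_servers(dump.read()):
            print(name)
```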
> > Do I need tracing enabled before the situation happens?
>
> That will make it a lot easier, yes :-)
>
> > What level (how many "rndc trace"s should I run)?
>
> You can specify a number directly, like `rndc trace 11` - level 11 is
> handy because it includes query and response packet dumps (er, but that
> is a 9.11 feature - in 9.9 you'll only get the response packets).
I'll set that trace now and hope to hit the problem again soon --
before I fill up my filesystem. :-)
Cheers,
b.