Slave zone intermittently not refreshing

Thu May 8 15:15:46 UTC 2014

Tony Finch <dot at dotat.at> writes:

> Mart van de Wege <mvdwege at gmail.com> wrote:
>> Tony Finch <dot at dotat.at> writes:
>> > Mart van de Wege <mvdwege at gmail.com> wrote:
>> >>
>> >> How do I go about troubleshooting this issue to get a better idea of
>> >> what is going on?
>> >
>> > Are there any messages in your log containing the string " refresh: "?
>>
>> I have a couple, all of them 'retry limit for master $foo exceeded'.
>
> That implies that the SOA query (which checks if an XFR is necessary) is
> timing out.
>
That was more or less the direction my thoughts were heading too. But I
couldn't for the life of me find a way to debug that properly.

> Try running the following on the secondary to see what fails. If you have
> a TSIG key you will need to use the -k or -y options.
>
> 	dig soa $zone @$master
> 	dig +noedns soa $zone @$master
> 	dig +tcp soa $zone @$master
> 	dig axfr $zone @$master

Aha.

My colleague was doing some 'dig'ging during the latest kerfuffle. I
will check to see if he ran any of these.

If not, I will have to wait until the lockup happens again.

I do know that the first one worked normally during the latest incident,
as I ran that myself.
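
For the next incident, the four suggested checks can be wrapped in a small
shell function so they all run in one go. This is only a sketch: the
function name check_master is made up here, and the +time=5 +tries=1
settings are an assumption to keep each check short. It relies on dig
exiting non-zero when no reply arrives (the timeout case), so a reply
carrying an error such as REFUSED still shows up as "ok".

```shell
# check_master ZONE MASTER [extra dig options, e.g. -k /path/to/keyfile]
# Runs the four suggested queries in order and reports which ones got no reply.
# Returns 0 if every query was answered, 1 if at least one was not.
check_master() {
    zone=$1; master=$2; shift 2
    failed=0
    for query in "soa" "+noedns soa" "+tcp soa" "axfr"; do
        # $query is left unquoted on purpose so "+noedns soa" splits
        # into two separate arguments for dig.
        if dig "$@" +time=5 +tries=1 $query "$zone" "@$master" >/dev/null 2>&1; then
            echo "ok:   dig $query $zone @$master"
        else
            echo "FAIL: dig $query $zone @$master"
            failed=1
        fi
    done
    return "$failed"
}
```

Run on the secondary as `check_master $zone $master`, adding the -k or -y
option after the master if the transfer uses a TSIG key.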
>
> A lot of the refresh failure logging happens at debug level 1 so you can
> get more details by running `rndc trace 1`.
>
Is there a way to filter that after setting it? Because as mentioned,
this is also the master server for quite a few domains, so I expect lots
of logging when I turn on debug tracing.
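
Failing a filter inside named itself, one fallback is to filter the log
after the fact, using the " refresh: " string mentioned earlier in the
thread. A rough sketch: the function name refresh_lines and the default
log path are assumptions, so substitute wherever named actually logs.
Note that the zone match is a plain substring match, so subdomains of
the zone will show up as well.

```shell
# refresh_lines ZONE [LOGFILE]
# Prints only the refresh-related log lines that mention ZONE.
refresh_lines() {
    zone=$1
    logfile=${2:-/var/log/named/named.log}   # assumed path, adjust as needed
    # First keep refresh messages, then narrow down to the zone of interest.
    grep -F ' refresh: ' "$logfile" | grep -F -i -- "$zone"
}
```

Usage would be something like `refresh_lines $zone` while `rndc trace 1`
is in effect; `rndc notrace` puts the debug level back to 0 afterwards.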

Regards,

Mart