Zone transfer failing

Chris Buxton cbuxton at menandmice.com
Thu Jun 25 07:07:38 UTC 2009


On Jun 24, 2009, at 4:39 PM, Mark Andrews wrote:
> In message <DC7C615C-B326-461A-9257-65CD3BA06A89 at menandmice.com>,  
> Chris Buxton
> writes:
>> On Jun 24, 2009, at 1:54 AM, Scott Haneda wrote:
>>> On Jun 23, 2009, at 11:57 PM, Chris Buxton wrote:
>>>> On Jun 23, 2009, at 3:16 PM, Scott Haneda wrote:
>>>>> Good observation.  This is a long standing issue that I assumed
>>>>> was solved.  Named on OS X will go deaf on port 53 tcp for some
>>>>> reason.  I just kicked it, and now I can tcp dig it.
>>>>>
>>>>> $ dig +tcp sugardimplesdesigns.com SOA @ns1.hostwizard.com +short
>>>>> ns1.hostwizard.com. scott.hostwizard.com. 2009062206 28800 7200
>>>>> 2419200 3600
>>>>>
>>>>> I know the Men and Mice guys are familiar with this, if you guys
>>>>> are reading, have you ever pinned this down, or found a solution
>>>>> to it?
>>>>
>>>> No, we have not. However, it appears to be related to the port
>>>> being idle for some time. Servers that use their TCP port more
>>>> frequently, usually due to having lots of zone updates that need to
>>>> be replicated to slaves, don't appear to be affected. You might try
>>>> creating a cron job that digs against the TCP port every 5 minutes
>>>> to try to keep the port "active" and prevent it from going deaf.
>>>>
>>>> I could be wrong, but I seem to recall that we've seen this on
>>>> other operating systems as well, although the lion's share of
>>>> reports have been with Mac OS X.
>>>
>>>
>>> In your estimation, a named/BIND bug, or OS level bug?  How would
>>> one go about finding out, and working it out, so we can solve this?
>>> I can certainly and very easily shove a launchd job in to run every
>>> 5 minutes, and do not even consider it much of a kludge, but would
>>> like to solve it for others, as it is a bear to track down.
>>
>> I'm not really qualified to determine where the bug is, but I would
>> guess there's a bug in the IP stack that most other server software
>> doesn't trigger. Perhaps ISC could find a way to work around it, or
>> perhaps Apple could fix it.
>
> 	It's a matter of reproducing it.  I suspect there is an
> 	unhandled / unexpected error code being returned.  Alternatively
> 	the interface disappears for a while, so we close the socket,
> 	and we can't re-bind to it when it re-appears because we
> 	no longer have permissions to do so when running with -u.

The problem occurs when running named as root, without a chroot jail,
so losing permission to re-bind under -u is not the cause.

Nearly all of our customers who use Mac OS X do not use -u, -t, or any  
other command line arguments. Those who do tend to run high-traffic  
servers and have never reported the TCP port going deaf.

> 	If Apple, or anyone else for that matter, has fine-grained
> 	permissions which will allow the ability to bind(2) to a
> 	reserved port as an ordinary user, like the Linux capability
> 	code does, we would be happy to receive patches.
>
> 	If it is a re-binding issue, then running without -u will
> 	address that.
>
>> My WAG is that Apple inherited this bug from *BSD, from which large
>> portions of the Darwin kernel are (or at least were) derived.
>
> 	I doubt that as named has always worked fine on the BSD
> 	kernel Apple used to start Darwin.

Except that my vague recollection is that I've seen this same bug on  
FreeBSD. I'm not sure, though - it's been a while since I've seen it  
on any kernel other than Darwin.
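
For anyone who wants to try the keep-alive workaround suggested
earlier in this thread, here is a minimal sketch in Python of a probe
that sends one SOA query to port 53 over TCP and reports whether the
server answered. The function names and the 5-second timeout are my
own choices for illustration, and I have not verified that running
it periodically actually prevents the port from going deaf.

```python
import socket
import struct
import sys


def build_soa_query(zone, query_id=0x1234):
    """Build a minimal DNS query message for the SOA record of `zone`."""
    # Header: ID, flags (RD set, as dig does by default),
    # 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 6 = SOA, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 6, 1)


def _recv_exact(sock, n):
    """Read exactly n bytes from the socket or raise ConnectionError."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def tcp_port_answers(server, zone, timeout=5.0):
    """Return True if `server` answers a DNS query on TCP port 53."""
    query = build_soa_query(zone)
    try:
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            # DNS over TCP prefixes every message with a 2-byte length.
            sock.sendall(struct.pack("!H", len(query)) + query)
            (length,) = struct.unpack("!H", _recv_exact(sock, 2))
            reply = _recv_exact(sock, length)
    except OSError:
        # Covers refused connections, timeouts and early closes.
        return False
    # A usable reply has a full 12-byte header and echoes our query ID.
    return len(reply) >= 12 and reply[:2] == query[:2]


# Only runs when invoked as: script.py <server> <zone>
if __name__ == "__main__" and len(sys.argv) == 3:
    ok = tcp_port_answers(sys.argv[1], sys.argv[2])
    print("tcp/53 ok" if ok else "tcp/53 not answering")
    sys.exit(0 if ok else 1)
```

Run from cron or launchd every 5 minutes with the server and a zone
it is authoritative for as arguments; a non-zero exit status means
the TCP port did not answer and named may need to be restarted.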

Chris Buxton
Professional Services
Men & Mice
