Zone transfer failing
Chris Buxton
cbuxton at menandmice.com
Thu Jun 25 07:07:38 UTC 2009
On Jun 24, 2009, at 4:39 PM, Mark Andrews wrote:
> In message <DC7C615C-B326-461A-9257-65CD3BA06A89 at menandmice.com>,
> Chris Buxton
> writes:
>> On Jun 24, 2009, at 1:54 AM, Scott Haneda wrote:
>>> On Jun 23, 2009, at 11:57 PM, Chris Buxton wrote:
>>>> On Jun 23, 2009, at 3:16 PM, Scott Haneda wrote:
>>>>> Good observation. This is a long-standing issue that I assumed
>>>>> was solved. Named on OS X will go deaf on port 53 tcp for some
>>>>> reason. I just kicked it, and now I can tcp dig it.
>>>>>
>>>>> $ dig +tcp sugardimplesdesigns.com SOA @ns1.hostwizard.com +short
>>>>> ns1.hostwizard.com. scott.hostwizard.com. 2009062206 28800 7200
>>>>> 2419200 3600
>>>>>
>>>>> I know the Men & Mice guys are familiar with this, if you guys
>>>>> are reading, have you ever pinned this down, or found a solution
>>>>> to it?
>>>>
>>>> No, we have not. However, it appears to be related to the port
>>>> being idle for some time. Servers that use their TCP port more
>>>> frequently, usually due to having lots of zone updates that need to
>>>> be replicated to slaves, don't appear to be affected. You might try
>>>> creating a cron job that digs against the TCP port every 5 minutes
>>>> to try to keep the port "active" and prevent it from going deaf.
>>>>
>>>> I could be wrong, but I seem to recall that we've seen this on
>>>> other operating systems as well, although the lion's share of
>>>> reports have been with Mac OS X.
>>>
>>>
>>> In your estimation, a named/BIND bug, or OS level bug? How would
>>> one go about finding out, and working it out, so we can solve this?
>>> I can certainly and very easily shove a launchd job in to run every
>>> 5 minutes, and do not even consider it much of a kludge, but would
>>> like to solve it for others, as it is a bear to track down.
>>
>> I'm not really qualified to determine where the bug is, but I would
>> guess there's a bug in the IP stack that most other server software
>> doesn't trigger. Perhaps ISC could find a way to work around it, or
>> perhaps Apple could fix it.
>
> It's a matter of reproducing it. I suspect there is an
> unhandled / unexpected error code being returned. Alternatively
> the interface disappears for a while, so we close the socket,
> and we can't re-bind to it when it re-appears because we
> no longer have permissions to do so when running with -u.
The problem also occurs when running named as root, without a chroot
jail, so losing permission to re-bind is not the cause.
Nearly all of our customers who use Mac OS X do not use -u, -t, or any
other command line arguments. Those who do tend to run high-traffic
servers and have never reported the TCP port going deaf.
> If Apple, or anyone else for that matter, has fine-grained
> permissions which will allow the ability to bind(2) to a
> reserved port as an ordinary user like Linux capability code
> does we would be happy to receive patches.
>
> If it is a re-binding issue then running without -u will
> address that.
>
>> My WAG is that Apple inherited this bug from *BSD, from which large
>> portions of the Darwin kernel are (or at least were) derived.
>
> I doubt that as named has always worked fine on the BSD
> kernel Apple used to start Darwin.
Except that my vague recollection is that I've seen this same bug on
FreeBSD. I'm not sure, though - it's been a while since I've seen it
on any kernel other than Darwin.
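
For anyone who wants to try the periodic TCP probe suggested earlier in
this thread, here is a minimal sketch of one way to do it without
depending on dig. It sends a single SOA query over TCP and reports
whether anything came back. The function names (build_query, recv_exact,
tcp_port_answers) and the 5-second timeout are my own choices for
illustration, not anything from BIND, and whether regular probing really
keeps the port from going deaf is still only a guess.

```python
#!/usr/bin/env python3
"""Probe a name server over TCP so a periodic job can spot a deaf port."""
import socket
import struct
import sys


def build_query(name, qtype=6, query_id=0x1234):
    """Return a TCP-framed DNS query (qtype 6 = SOA, class IN)."""
    # Header: ID, flags (RD bit set), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    message = header + question
    # DNS over TCP prefixes each message with a two-byte length field.
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, count):
    """Read exactly count bytes, or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def tcp_port_answers(server, zone, port=53, timeout=5.0, query_id=0x1234):
    """Return True if the server sends back a reply with a matching ID."""
    try:
        with socket.create_connection((server, port), timeout=timeout) as sock:
            sock.sendall(build_query(zone, query_id=query_id))
            (length,) = struct.unpack("!H", recv_exact(sock, 2))
            reply = recv_exact(sock, length)
    except OSError:
        # Refused, timed out, or closed early: treat the port as deaf.
        return False
    return len(reply) >= 12 and struct.unpack("!H", reply[:2])[0] == query_id


if __name__ == "__main__" and len(sys.argv) == 3:
    server, zone = sys.argv[1], sys.argv[2]
    ok = tcp_port_answers(server, zone)
    print("TCP/53 on %s: %s" % (server, "answering" if ok else "NOT answering"))
```

Saved under a name of your choosing and run every 5 minutes from cron
or launchd with the server and zone as arguments (for example
ns1.hostwizard.com and sugardimplesdesigns.com), it doubles as a
keep-alive and as a log of when the port stopped answering, which might
help narrow down when the failure starts.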
Chris Buxton
Professional Services
Men & Mice