TCP queries fail - BIND 9.5.0 Windows Server 2003

Jukka Pakkanen jukka.pakkanen at qnet.fi
Sat Jul 12 19:01:53 UTC 2008


----- Original Message ----- 
From: "Danny Mayer" <mayer at ntp.isc.org>
To: "Jukka Pakkanen" <jukka.pakkanen at qnet.fi>
Cc: <bind-users at isc.org>
Sent: Saturday, July 12, 2008 8:33 PM
Subject: Re: TCP queries fail - BIND 9.5.0 Windows Server 2003


> Jukka Pakkanen wrote:
>> Upgraded to 9.5.0b1, and still the same problem. Every W2K or W2k3 server 
>> runs out of memory in a few days. No matter whether it has 1 or 4 gigs of 
>> RAM, the named process just grows and grows...
>>
>
> I doubt that upgrading is likely to make much difference in that respect. 
> Why did you install b1? That's a prerelease version of 9.5.0. You 
> shouldn't be running it now that 9.5.0 is released.

Typo, 9.5.1b1.

> Note that named will grow its memory usage for about a week before it 
> stabilizes as it's caching a lot of DNS answers (both positive and 
> negative). One thing you might want to try is reducing the caching limits. 
> There are several things you can put into options:
> max-cache-size nnn (default is 32M)
> max-cache-ttl nnn (default is 7 days)
> max-ncache-ttl nnn (default is 3 hours)

This is not normal cache filling. As I wrote earlier, it will grow until 
all RAM is used up, whether the server has 1G or 4G. This never happened 
with pre-9.5.0 versions, where usage used to be 20-30M. Now every Windows 
server we have has the same problem. And judging from the list, other 
Windows users have the same problem.

> I don't have any recommended numbers for you to put in there but you can 
> try and reduce them from the default values.

I have tested different cache values, as I said in an earlier post. No 
effect. Memory is leaking... somewhere.

> Note that based on the error you may have run out of Windows handles. If 
> you can monitor named, check the number of handles that it's using. This 
> is not shown by default in the Task Manager; you have to add the Handle 
> Count column from the View->Select Columns menu. While you are at it, 
> add the thread count. This will give an idea of whether there is a handle 
> problem. Also run netstat -aon and count the number of sockets opened by 
> named.

Will test that.
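
The socket count Danny suggests can be scripted rather than done by eye. Below is a minimal Python sketch, assuming the usual Windows `netstat -aon` column layout (Proto, Local Address, Foreign Address, State, PID, with no State column on UDP lines). The function name `count_sockets`, the sample text, and PID 1234 are made up for illustration; the real PID of named comes from Task Manager.

```python
from collections import Counter

def count_sockets(netstat_output: str, pid: int) -> Counter:
    """Tally sockets owned by `pid` in `netstat -aon` output.

    TCP sockets are grouped by state (e.g. "TCP LISTENING");
    UDP lines carry no state, so they are counted under "UDP".
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banners, the column header, and blank lines.
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        # The owning PID is always the last column.
        if fields[-1] != str(pid):
            continue
        if fields[0] == "TCP" and len(fields) == 5:
            counts["TCP " + fields[3]] += 1
        else:
            counts[fields[0]] += 1
    return counts

# Invented sample in the assumed layout; not output from a real server.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:53             0.0.0.0:0              LISTENING       1234
  TCP    192.0.2.1:53           198.51.100.7:4021      ESTABLISHED     1234
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       900
  UDP    0.0.0.0:53             *:*                                    1234
"""

for kind, n in sorted(count_sockets(SAMPLE, 1234).items()):
    print(kind, n)
```

Saving the output of `netstat -aon` to a file at intervals and feeding each snapshot to the function would show whether the socket count climbs along with memory use.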



>> ----- Original Message ----- 
>> From: "Jukka Pakkanen" <jukka.pakkanen at qnet.fi>
>> To: <bind-users at isc.org>
>> Sent: Saturday, July 05, 2008 11:49 AM
>> Subject: Re: TCP queries fail - BIND 9.5.0 Windows Server 2003
>>
>>
>>> Anyone here running 9.5.0 on a Windows server?
>>>
>>> Would be nice to know if everyone has the same problem with memory or 
>>> not.
>>>
>>> Jukka
>>>
>>> ----- Original Message ----- 
>>> From: "Vinny Abello" <vinny at tellurian.com>
>>> To: "Jukka Pakkanen" <jukka.pakkanen at qnet.fi>; <bind-users at isc.org>
>>> Sent: Saturday, July 05, 2008 11:43 AM
>>> Subject: RE: TCP queries fail - BIND 9.5.0 Windows Server 2003
>>>
>>>
>>> I hadn't noticed the memory usage, but it is possible. I think we have 
>>> around 2GB of RAM in these servers and they didn't appear to be 
>>> responding sluggishly as if they were swapping, so I'm not sure if it 
>>> has anything to do with the memory usage or not. I didn't feel like 
>>> checking everything under the sun while it wasn't working. I just wanted 
>>> it to work so I rolled back. It could be another resource issue like a 
>>> handle or some other resource leak I suppose.
>>>
>>>> -----Original Message-----
>>>> From: bind-users-bounce at isc.org [mailto:bind-users-bounce at isc.org] On
>>>> Behalf Of Jukka Pakkanen
>>>> Sent: Wednesday, July 02, 2008 4:58 PM
>>>> To: bind-users at isc.org
>>>> Subject: Re: TCP queries fail - BIND 9.5.0 Windows Server 2003
>>>>
>>>> I wonder if this is the same problem we are experiencing after
>>>> upgrading from 9.4.2 to 9.5.0. Win2K and W2K3. Do you see any memory
>>>> problems with the named process when this happens? Our named grows
>>>> until the memory and virtual memory are exhausted, and stops
>>>> responding to queries. Restarting the service "solves" the problem.
>>>> For a few days.
>>>>
>>>> ----- Original Message -----
>>>> From: "Vinny Abello" <vinny at tellurian.com>
>>>> To: <mayer at ntp.isc.org>
>>>> Cc: <bind-users at isc.org>
>>>> Sent: Friday, June 27, 2008 6:37 AM
>>>> Subject: RE: TCP queries fail - BIND 9.5.0 Windows Server 2003
>>>>
>>>>
>>>> From: Danny Mayer [mayer at ntp.isc.org]
>>>> Sent: Thursday, June 26, 2008 10:20 PM
>>>> To: Vinny Abello
>>>> Cc: bind-users at isc.org
>>>> Subject: Re: TCP queries fail - BIND 9.5.0 Windows Server 2003
>>>>
>>>> Vinny Abello wrote:
>>>>> I recently upgraded from BIND 9.4.2 on Windows Server 2003 to BIND
>>>>> 9.5.0. I was troubleshooting an issue today only to track it down to
>>>>> the fact that my name servers were no longer servicing requests on
>>>>> TCP port 53. UDP queries continued to work without any issues. On one
>>>>> server I noted in the logs:
>>>>>
>>>>> 16-Jun-2008 13:27:30.687 general: .\socket.c:1934: unexpected error:
>>>>> 16-Jun-2008 13:27:30.687 general: socket() failed: Invalid argument
>>>>>
>>>>> All of my name servers would not respond to TCP queries during my
>>>>> tests. Eventually I restarted the BIND service on one of my name
>>>>> servers and everything came back to life and was working properly. I
>>>>> downgraded back to BIND 9.4.2 for the time being.
>>>>>
>>>>> This appears to be a bug from what I can tell. When this was
>>>>> happening, if I telnet to port 53, the socket connects, but as soon as
>>>>> any data is sent, the socket is immediately closed. Again, a restart
>>>>> of BIND seemed to fix it. This is on multiple servers as well.
>>>>>
>>>>> Has anyone else seen this? I'm cc'ing bind-bugs to file a bug report.
>>>>>
>>>>> -Vinny
>>>> What is going on when this happens? Are you doing zone transfers or
>>>> something else? If zone transfers, is that from or to the server?
>>>>
>>>> Danny
>>>>
>>>> Well, the servers currently (unfortunately) are set up as recursive
>>>> resolvers in addition to being masters and slaves for over 1000 zones,
>>>> so they're doing pretty much everything. The primary server that is
>>>> the master for almost all the zones also experienced this problem, as
>>>> did the slaves. We were alerted to the problem when someone with an
>>>> application that did TCP-based DNS queries said they couldn't resolve
>>>> anything.
>>>>
>>>>
>>
>>
>>
> 
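
The TCP symptom from the original report (port 53 accepts the connection but drops it as soon as data arrives) can be checked with a real DNS query instead of telnet. The following is a rough Python sketch, not a tested monitoring tool: it relies on the standard DNS-over-TCP framing, where each message is preceded by a two-byte length field. The function names (`build_query`, `frame_for_tcp`, `tcp_dns_check`) and the fixed query ID are hypothetical choices for illustration.

```python
import socket
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (one question, class IN, RD flag set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its length as a 16-bit big-endian integer."""
    return struct.pack("!H", len(message)) + message

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the connection early")
        buf += chunk
    return buf

def tcp_dns_check(server: str, name: str, timeout: float = 5.0) -> int:
    """Send one query over TCP port 53 and return the reply's RCODE."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(frame_for_tcp(build_query(name)))
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        reply = recv_exact(sock, length)
    if len(reply) < 12:
        raise ValueError("reply shorter than a DNS header")
    (flags,) = struct.unpack("!H", reply[2:4])
    return flags & 0x000F  # low four bits of the flags word hold the RCODE
```

With a server in the failing state described above, a call such as `tcp_dns_check("192.0.2.53", "example.com")` (placeholder address) would be expected to raise `ConnectionError` rather than return an RCODE, since the server closes the socket once data is sent.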