failed: out of memory
Thomas Schulz
schulz at adi.com
Tue Jul 22 15:51:29 UTC 2014
> You'll want to use max-cache-size to enforce a hard limit on the size
> of your cache.
> http://www.zytrax.com/books/dns/ch7/hkpng.html#max-cache-size
>
> /Tim
>
> ---
> Tim Krzywonos
> e:: tim at krzywonos.ca
Thanks for reminding me of that. Now that I have some confidence
that the problem is the cache and not some funny memory leak, I
think I can rely on setting a limit. I still find it strange that
when I dumped the cache, the resulting file was only 6 MB in size.
The process size had grown to 257 MB, up from an initial size of
36 MB. It does not make sense.
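
To put the page counts and the dump size on the same scale, here is a
minimal Python sketch of the arithmetic. It assumes the 8192-byte page
size and uses only figures reported in this thread; the function names
are illustrative and not part of BIND.

```python
PAGE_SIZE = 8192  # bytes per page, as reported for this host (assumption from the thread)


def pages_to_bytes(pages: int) -> int:
    """Convert a process size in pages to bytes."""
    return pages * PAGE_SIZE


def daily_growth(samples: list[int]) -> float:
    """Average growth in pages per day between the first and last daily sample."""
    return (samples[-1] - samples[0]) / (len(samples) - 1)


# Figures quoted in this thread.
process_pages = 31305   # size of named at the time of the rndc flush
dump_bytes = 5927042    # size of named_dump.db from rndc dumpdb -cache
daily_pages = [16517, 19221, 20111, 23707, 24957, 26384, 28231, 29912]

process_bytes = pages_to_bytes(process_pages)
growth_pages = daily_growth(daily_pages)

print(f"process size: {process_bytes} bytes ({process_bytes / 2**20:.1f} MiB)")
print(f"dump file:    {dump_bytes} bytes ({dump_bytes / 2**20:.1f} MiB)")
print(f"ratio:        {process_bytes / dump_bytes:.0f}x")
print(f"growth:       {growth_pages:.0f} pages/day "
      f"({pages_to_bytes(1) * growth_pages / 2**20:.1f} MiB/day)")
```

The dump is a text rendering of the cache, so its size is only a rough
indication of how much memory the cache occupies in the running process.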
> On 2014-07-21 10:57, schulz at adi.com wrote:
>>>> Have you tried an rndc flush? You can also dump the contents of the
>>>> cache to find the (approximate) size of the cache. If related to cache,
>>>> you can tweak parameters to cache, most namely max-cache-size. IIRC,
>>>> the cache doesn't have a size limit by default.
>>>>
>>>> /Tim
>>>>
>>> I did an rndc dumpdb -cache and the size of the named_dump.db that
>>> resulted is 5927042. Not all that big considering how it is
>>> formatted.
>>> Late last night I did a rndc flush. At that time the size of named
>>> was 31305 pages of 8192 bytes. As of now (13 hours later) the size
>>> is still 31305. I will see what happens.
>>
>> See below for our named.conf and then my original description of the
>> problem.
>>
>> As of this morning (3 days 12 hours later) named is still at 31305 pages.
>> So it appears that the continuous growth that I was seeing is due to the
>> cache.
>>
>> Unfortunately my investigation has not been very methodical. I should
>> have noted the size of the named process when I was getting the out of
>> memory errors. I also should have noted the rate of growth of 9.9.5-P1
>> before trying 9.9.6b1. I am going to switch back to 9.9.5-P1 for a few
>> days and see if the rate of growth is about the same or if it is much
>> worse. (The initial size was 4734 pages and jumped to 7666 within
>> 5 minutes). Assuming that the cache cleaning is working correctly, it
>> may be that a 32 bit process is just not viable these days. I have now
>> built a 64 bit named and will switch to that in a few days.
>>
>> A big problem is that I will be going on vacation at the end of the week
>> and I really want to make sure that named does not shut down while I am
>> away. There is really not enough time to do enough testing to make
>> sure of that. I may set up a cron job to do a daily rndc flush while I
>> am away.
>>
>>>
>>> I was asked off list for our named.conf. Here it is.
>>> options {
>>>     directory "/var/named";
>>>     acache-enable yes;
>>>     auth-nxdomain no;
>>>     transfer-format many-answers;
>>>     dnssec-enable yes;
>>>     dnssec-validation yes;
>>>     dnssec-lookaside auto;
>>> };
>>> managed-keys {
>>>     dlv.isc.org. initial-key 257 3 5 .....;
>>> };
>>> managed-keys {
>>>     "." initial-key 257 3 8 .......;
>>> };
>>>
>>> view "internal" {
>>>     match-clients { !192.168.3.95; !192.168.3.150;
>>>                     !192.168.4.0/24; localnets;
>>>     };
>>>     sortlist {
>>>         { 192.168.2.0/24; { 192.168.2.0/24; 192.168.3.0/24; }; };
>>>         { 192.168.3.0/24; { 192.168.3.0/24; 192.168.2.0/24; }; };
>>>     };
>>>     zone "." {
>>>         type hint;
>>>         file "named.root";
>>>     };
>>>
>>>     zone "adi.com" {
>>>         type master;
>>>         file "adi.com.hosts.int";
>>>         check-names ignore;
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>         };
>>>     };
>>>
>>>     zone "130-157.245.100.75.in-addr.arpa" {
>>>         type master;
>>>         file "75.100.245.130-157.revhosts";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>         };
>>>     };
>>>
>>>     zone "2.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.2.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>         };
>>>     };
>>>
>>>     zone "3.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.3.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>         };
>>>     };
>>>
>>>     zone "4.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.2.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>         };
>>>     };
>>>
>>>     zone "localhost" {
>>>         type master;
>>>         notify no;
>>>         file "named.local";
>>>     };
>>>
>>>     zone "0.0.127.in-addr.arpa" {
>>>         type master;
>>>         notify no;
>>>         file "named.revlocal";
>>>     };
>>>
>>>     zone "com" {
>>>         type delegation-only;
>>>     };
>>>
>>>     zone "net" {
>>>         type delegation-only;
>>>     };
>>> };
>>>
>>> view "internal4" {
>>>     match-clients { 192.168.4.0/24; };
>>>     zone "." {
>>>         type hint;
>>>         file "named.root";
>>>     };
>>>
>>>     zone "adi.com" {
>>>         type master;
>>>         file "adi.com.hosts.int4";
>>>         check-names ignore;
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.4.95;
>>>             192.168.4.150;
>>>         };
>>>     };
>>>
>>>     zone "130-157.245.100.75.in-addr.arpa" {
>>>         type master;
>>>         file "75.100.245.130-157.revhosts";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.4.95;
>>>             192.168.4.150;
>>>         };
>>>     };
>>>
>>>     zone "2.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.2.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.4.95;
>>>             192.168.4.150;
>>>         };
>>>     };
>>>
>>>     zone "3.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.3.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.4.95;
>>>             192.168.4.150;
>>>         };
>>>     };
>>>
>>>     zone "4.168.192.in-addr.arpa" {
>>>         type master;
>>>         file "192.168.2.revhosts.int";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.4.95;
>>>             192.168.4.150;
>>>         };
>>>     };
>>>
>>>     zone "localhost" {
>>>         type master;
>>>         notify no;
>>>         file "named.local";
>>>     };
>>>
>>>     zone "0.0.127.in-addr.arpa" {
>>>         type master;
>>>         notify no;
>>>         file "named.revlocal";
>>>     };
>>>
>>>     zone "com" {
>>>         type delegation-only;
>>>     };
>>>
>>>     zone "net" {
>>>         type delegation-only;
>>>     };
>>> };
>>>
>>> view "external" {
>>>     match-clients { any; };
>>>     allow-recursion { 75.100.245.0/24; };
>>>     zone "." {
>>>         type hint;
>>>         file "named.root";
>>>     };
>>>
>>>     zone "adi.com" {
>>>         type master;
>>>         file "adi.com.hosts.ext";
>>>         inline-signing yes;
>>>         key-directory "dnssec";
>>>         auto-dnssec maintain;
>>>         also-notify {
>>>             192.168.3.95;
>>>             192.168.3.150;
>>>             216.170.230.22;
>>>         };
>>>     };
>>>
>>>     zone "130-157.245.100.75.in-addr.arpa" {
>>>         type master;
>>>         file "75.100.245.130-157.revhosts";
>>>         notify explicit;
>>>         also-notify {
>>>             192.168.2.95;
>>>             192.168.2.150;
>>>             216.170.230.22;
>>>         };
>>>     };
>>>
>>>     zone "com" {
>>>         type delegation-only;
>>>     };
>>>
>>>     zone "net" {
>>>         type delegation-only;
>>>     };
>>> };
>>>
>>>>
>>>> On 2014-07-17 10:39, schulz at adi.com wrote:
>>>>> We are running Bind on a Sun Sparc machine running Solaris 8. Bind is
>>>>> built as a 32 bit executable as that is the default and is the way
>>>>> libcrypto and libxml2 are built. We have been running Bind 9.9.5.
>>>>> I am now trying Bind 9.9.6b1 as that claims to have fixed some memory
>>>>> leaks.
>>>>>
>>>>> For some time now Bind has stopped being able to do recursive queries
>>>>> every couple of weeks and I have been just restarting it. I decided to
>>>>> look into this and found it logging out of memory errors. This seems to
>>>>> have started happening after I set up bind to sign our domain, adi.com.
>>>>> The server is bluegill.adi.com. It is set up with 3 views. Two are
>>>>> internal views and can do recursive queries. One is the external view
>>>>> and does not allow recursive queries.
>>>>>
>>>>> Since restarting named, this time Bind 9.9.6b1, I have been checking
>>>>> the memory usage every day. The usage in pages of 8192 bytes for the
>>>>> last 7 days is:
>>>>> 16,517 19,221 20,111 23,707 24,957 26,384 28,231 29,912
>>>>>
>>>>> Note that this shows no signs of settling down. I am looking into
>>>>> the possibility of rebuilding Bind as a 64 bit executable as that
>>>>> should take much longer to run out of memory.
>>>>>
>>>>> A recent section of the log showing that the cleaner is running:
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice] acache 91e6a30 stats: hits=0 misses=6 queries=6 adds=6 deleted=5 cleaned=5 cleaner_runs=140 overmem=0 overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice] acache 91e6a30 cleaning interval set to 3600.
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice] acache 933f990 stats: hits=3299 misses=79 queries=3378 adds=86 deleted=370 cleaned=370 cleaner_runs=144 overmem=0 overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice] acache 933f990 cleaning interval set to 3600.
>>>>> Jul 17 10:24:46 bluegill named[9334]: [ID 873579 daemon.notice] acache 9166a20 stats: hits=76514 misses=4348 queries=80862 adds=4348 deleted=3717 cleaned=3717 cleaner_runs=144 overmem=0 overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:46 bluegill named[9334]: [ID 873579 daemon.notice] acache 9166a20 cleaning interval set to 3600.
>>>>> Jul 17 10:29:51 bluegill named[9334]: [ID 873579 daemon.notice] clients-per-query decreased to 10
Tom Schulz
Applied Dynamics Intl.
schulz at adi.com