BIND - out of memory

Jan Arild Lindstrøm jal at telenor.net
Thu Mar 26 08:13:25 UTC 2009


Hi,

I already tried 9.4.3, and it happened there too.

Trying 9.6.0-P1 gives the same result:

        recursive clients: 1083/49900/50000
        
        --cut--
        26-Mar-2009 08:04:39.736 database: adb: fetch of 'mars.csd.unb.ca' A failed: out of memory
        26-Mar-2009 08:04:39.737 database: adb: fetch of 'dns.guangzhou.gd.cn' A failed: out of memory
        --cut--

        nsXX(root) / 502# plimit 9474
        9474:   /local/named-new/local/sbin/named -f -c /etc/named/named.conf.new -u n
           resource              current         maximum
          time(seconds)         unlimited       unlimited
          file(blocks)          unlimited       unlimited
          data(kbytes)          unlimited       unlimited
          stack(kbytes)         unlimited       unlimited
          coredump(blocks)      unlimited       unlimited
          nofiles(descriptors)  unlimited       unlimited
          vmemory(kbytes)       unlimited       unlimited

        tcp-clients 5000;
        clients-per-query 2500;
        max-clients-per-query 5000;
        recursive-clients 50000;

        (Big numbers to be sure they are not the reason for the 1000 limit.)

9.6.1b1 info:
        file local/named-new/local/sbin/named:       
        ELF 64-bit MSB executable SPARCV9 Version 1, UltraSPARC3 Extensions Required, dynamically linked, not stripped

        Sun Studio Express 3/09 (-xtarget=ultraT2plus -m64):
        nsXX(root) named-new 543# /local/named-new/local/sbin/named -V
        BIND 9.6.1b1 built with '--prefix=/local' '--localstatedir=/var' '--with-openssl=/local/openssl' '--with-randomdev=/dev/urandom'        
        '--enable-threads' '--with-libtool' '--enable-static=yes' '--disable-shared' '--sysconfdir=/etc/named' 'CC=/opt/StudioExpress/SSX0903/bin/cc' 
        'CFLAGS= -xtarget=ultraT2plus -m64' 'LDFLAGS= -xtarget=ultraT2plus -m64' 'CPPFLAGS= -xtarget=ultraT2plus -m64' 
        'CXX=/opt/StudioExpress/SSX0903/bin/CC' 'CXXFLAGS= -xtarget=ultraT2plus -m64'

        SunOS nsXX.xxx.xx 5.10 Generic_138888-01 sun4v sparc SUNW,T5140 Solaris

BIND 9.4.3
        Sun Studio 12: -fast -xtarget=ultraT1 -m64
BIND 9.6.0-P1:
        Sun Studio Express 11/08: -fast -xtarget=native64

I also tried it on another server, and the same thing happens: as soon as the number of
recursive clients passes 1000 (the default), "out of memory" messages start to flood the log.
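To pin down the client count at which the errors begin, the "recursive clients" counter and the adb log lines quoted above can be matched up with a short script. This is only a minimal sketch: the function names are made up for illustration, and it assumes the three numbers in the counter are current / soft limit / hard limit and that the log lines have exactly the shape shown in the excerpt.

```python
import re

# Shape of the counter line quoted above, e.g.
# "recursive clients: 1083/49900/50000"
# (assumed meaning: current / soft limit / hard limit).
STATUS_RE = re.compile(r"recursive clients:\s*(\d+)/(\d+)/(\d+)")

# Shape of the adb failures in the log excerpt above.
OOM_RE = re.compile(r"adb: fetch of '([^']+)' \S+ failed: out of memory")


def parse_recursive_clients(line):
    """Return the three counters as a dict, or None if the line does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    current, soft, hard = (int(group) for group in match.groups())
    return {"current": current, "soft": soft, "hard": hard}


def oom_names(log_lines):
    """Return the names whose adb fetch failed with 'out of memory'."""
    names = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match is not None:
            names.append(match.group(1))
    return names
```

Feeding it the counter line and the log lines from the same minute shows whether the failures line up with the count crossing 1000.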

Anyway:
        During the peak of the day our production servers are at around 700 recursive clients,
        so we are not affected by the 1000/default "limit" yet. But if we flush the cache we
        have a problem (as always): suddenly there are more than 8000 recursive clients within
        one second, and the number keeps growing until it reaches 50000 and the server is
        jammed. Hence we try very hard not to flush the whole cache or restart BIND. I have
        asked about that problem before, but no solution has been found. Perhaps the
        1000/default "limit" and the out-of-memory errors are the reason, or one of the
        reasons, for this.


Regards
Jan Arild Lindstrom


At 22:41 25/03/2009, Doug Barton wrote:
>Jan Arild Lindstrøm wrote:
>> Hi,
>> 
>> more findings ...
>> 
>> BIND 9.6.1b1
>> 
>> No matter what I set in named.conf, it starts to give "out of memory" when recursive
>> clients pass 1000. I see that 1000 is the default value for recursive-clients.
>
>Did you try backing up to 9.6.0-P1 to see if the same behavior exists
>there?
>
>
>Doug



