random dispatch error after reconfig with bind ESV 9.11

Laurent Frigault lolo at troll.free.org
Wed Jun 24 12:03:32 UTC 2020


On Wed, Jun 24, 2020 at 01:11:25PM +0200, Laurent Frigault wrote:
> Hi,
> 
> OS: FreeBSD 12.1 / AMD64
> 56 cores CPU
> 64G RAM
> 
> bind ESV 9.11.19 from standard freebsd pkg
> 
> We are runng an hiddenmaster bind server with DNSSEC zone "bookmyname.be" {

Oops,

I meant to write:

We are running a hidden master BIND server with DNSSEC zones, all like
this one:

> zone "bookmyname.be" {
>     type master;
>     file "custom/b/o/bookmyname.be/bookmyname.be";
>     notify explicit;
>     also-notify { 213.36.252.135; 62.210.98.15; 213.36.253.14; };
>     auto-dnssec maintain;
>     inline-signing yes;
>     key-directory "custom/b/o/bookmyname.be";
> };
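
Since every zone follows the same template, the stanza can be rendered from the zone name alone. The sketch below is only an illustration: the function name zone_stanza is hypothetical, and the rule that directories are sharded by the first two characters of the zone name is an assumption inferred from the single path "custom/b/o/bookmyname.be" quoted above.

```python
def zone_stanza(zone, notify_targets):
    """Render one master-zone stanza in the layout quoted above."""
    # Assumption: the directory is sharded by the first two characters of
    # the zone name, as in "custom/b/o/bookmyname.be".
    zone_dir = f"custom/{zone[0]}/{zone[1]}/{zone}"
    targets = " ".join(f"{addr};" for addr in notify_targets)
    return "\n".join([
        f'zone "{zone}" {{',
        "    type master;",
        f'    file "{zone_dir}/{zone}";',
        "    notify explicit;",
        f"    also-notify {{ {targets} }};",
        "    auto-dnssec maintain;",
        "    inline-signing yes;",
        f'    key-directory "{zone_dir}";',
        "};",
    ])


if __name__ == "__main__":
    print(zone_stanza(
        "bookmyname.be",
        ["213.36.252.135", "62.210.98.15", "213.36.253.14"],
    ))
```

Run as a script, this prints the same stanza as the example zone, without the "> " quoting.
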
> 
> We have about 73000 zones, most of them signed.
> 
> We periodically regenerate our configuration to add/update/remove zones.
> 
> When needed, we apply the changes with "rndc reconfig".
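
A minimal sketch of how such a regenerate-then-reconfig step could be wrapped, assuming named-checkconf and rndc are on the PATH and that rndc exits with a non-zero status when the reconfig fails. The function name safe_reconfig and the injectable run parameter are hypothetical, added only so the logic can be exercised without a running named. A syntax check like this would not catch a runtime failure inside named itself; it only rules out a broken generated file.

```python
import subprocess


def safe_reconfig(conf_path, run=subprocess.run):
    """Validate conf_path with named-checkconf, then run "rndc reconfig".

    Returns True only if both commands exit with status 0.
    """
    # Step 1: syntax-check the generated configuration file.
    check = run(["named-checkconf", conf_path], capture_output=True, text=True)
    if check.returncode != 0:
        return False
    # Step 2: ask the running named to pick up added and removed zones.
    reconfig = run(["rndc", "reconfig"], capture_output=True, text=True)
    return reconfig.returncode == 0
```

A caller could alert or fall back to a full restart when safe_reconfig() returns False.
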
> 
> Every few weeks we get the following error in the log:
> 
> Jun 18 11:02:41 nsmaster named[50196]: 18-Jun-2020 11:02:41.989 general: error: ./server.c:3881: unexpected error:
> Jun 18 11:02:41 nsmaster named[50196]: 18-Jun-2020 11:02:41.989 general: error: unable to obtain neither an IPv4 nor an IPv6 dispatch
> Jun 18 11:02:42 nsmaster named[50196]: 18-Jun-2020 11:02:42.728 general: error: reloading configuration failed: unexpected error
> 
> After this, named stops responding for some zones, but not all of them,
> and not always the most recently added ones.
> 
> Of course, the IPv4 and IPv6 addresses on the server were not changed, added, or removed, and the configuration was correct (it is generated by a script from a database).
> 
> After this, the only solution is to stop and restart BIND. Another rndc reconfig produces the same error.
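
Because the failure only shows up every few weeks, it may help to detect it automatically. The sketch below scans log lines for the "reloading configuration failed" message quoted above and returns the reason text. The pattern is derived only from that quoted log line, and the function name failed_reloads is hypothetical.

```python
import re

# Matches the failure line format seen in the log excerpt above.
FAILED_RELOAD = re.compile(
    r"general: error: reloading configuration failed: (?P<reason>.+)$"
)


def failed_reloads(lines):
    """Return the reason text of every failed-reload line in lines."""
    reasons = []
    for line in lines:
        match = FAILED_RELOAD.search(line.rstrip("\n"))
        if match:
            reasons.append(match.group("reason"))
    return reasons


if __name__ == "__main__":
    sample = [
        "Jun 18 11:02:41 nsmaster named[50196]: 18-Jun-2020 11:02:41.989 "
        "general: error: ./server.c:3881: unexpected error:",
        "Jun 18 11:02:42 nsmaster named[50196]: 18-Jun-2020 11:02:42.728 "
        "general: error: reloading configuration failed: unexpected error",
    ]
    print(failed_reloads(sample))
```

A non-empty result after a reconfig would be the signal to restart named, since a second rndc reconfig does not clear the error.
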
> 
> Someone told me this may be a socket issue.
> The FreeBSD pkg/port is built with --with-tuning=default
> 
> % /usr/local/sbin/named -V
> BIND 9.11.19 (Extended Support Version) <id:905ec64>
> running on FreeBSD amd64 12.1-RELEASE-p6 FreeBSD 12.1-RELEASE-p6 GENERIC
> built by make with '--localstatedir=/var' '--disable-linux-caps' '--with-randomdev=/dev/random' '--with-libxml2=/usr/local' '--with-readline=-L/usr/local/lib -ledit' '--with-dlopen=yes' '--with-gost=no' '--without-python' '--sysconfdir=/usr/local/etc/namedb' '--with-dlz-filesystem=yes' '--disable-dnstap' '--enable-filter-aaaa' '--disable-fixed-rrset' '--without-geoip2' '--without-gssapi' '--with-libidn2=/usr/local' '--enable-ipv6' '--with-libjson=/usr/local' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--disable-querytrace' '--enable-rpz-nsdname' '--enable-rpz-nsip' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--with-openssl=/usr' '--enable-threads' '--with-tuning=default' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.1' 'build_alias=amd64-portbld-freebsd12.1' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
> compiled by CLANG 4.2.1 Compatible FreeBSD Clang 8.0.1 (tags/RELEASE_801/final 366581)
> compiled with OpenSSL version: OpenSSL 1.1.1d-freebsd  10 Sep 2019
> linked to OpenSSL version: OpenSSL 1.1.1d-freebsd  10 Sep 2019
> compiled with libxml2 version: 2.9.10
> linked to libxml2 version: 20910
> compiled with libjson-c version: 0.13.1
> linked to libjson-c version: 0.13.1
> compiled with zlib version: 1.2.11
> linked to zlib version: 1.2.11
> threads support is enabled
> 
> default paths:
>   named configuration:  /usr/local/etc/namedb/named.conf
>   rndc configuration:   /usr/local/etc/namedb/rndc.conf
>   DNSSEC root key:      /usr/local/etc/namedb/bind.keys
>   nsupdate session key: /var/run/named/session.key
>   named PID file:       /var/run/named/pid
>   named lock file:      /var/run/named/named.lock
> 
> 
> I upgraded BIND to 9.11.20 and restarted it with -U 21000.
> After looking at the source, I added query-source and query-source-v6 to
> the configuration, even though the server has "recursion no;" and has no slave zones.
> 
> Today, same problem:
> 
> Jun 24 08:52:39 nsmaster named[46662]: general: error: could not get query source dispatcher (213.36.252.194#0)
> Jun 24 08:52:39 nsmaster named[46662]: general: error: reloading configuration failed: out of memory
> 
> top shows no memory issue:
> 
> last pid: 16247;  load averages:  1.08,  1.80,  6.80   up 154+20:15:48 13:02:17
> 40 processes:  1 running, 39 sleeping
> CPU:  1.6% user,  0.0% nice,  0.5% system,  0.1% interrupt, 97.8% idle
> Mem: 4568M Active, 4978M Inact, 32G Wired, 580M Buf, 21G Free
> ARC: 16G Total, 8811M MFU, 5160M MRU, 20M Anon, 471M Header, 1751M Other
>      12G Compressed, 38G Uncompressed, 3.31:1 Ratio
> Swap: 64G Total, 64G Free
> 
> ...
> 14164 bind         59  52    0  5320M  5200M sigwai  32  21.1H   0.31% named
> ...
> 
> % limits
> Resource limits (current):
>   cputime              infinity secs
>   filesize             infinity kB
>   datasize             33554432 kB
>   stacksize              524288 kB
>   coredumpsize         infinity kB
>   memoryuse            infinity kB
>   memorylocked               64 kB
>   maxprocesses            63709
>   openfiles             1883583
>   sbsize               infinity bytes
>   vmemoryuse           infinity kB
>   pseudo-terminals     infinity
>   swapuse              infinity kB
>   kqueues              infinity
>   umtxp                infinity
> 
> # rndc status
> version: BIND 9.11.20 (Extended Support Version) <id:f3d1d66>
> running on nsmaster.free.org: FreeBSD amd64 12.1-RELEASE-p1 FreeBSD 12.1-RELEASE-p1 GENERIC
> boot time: Wed, 24 Jun 2020 08:11:43 GMT
> last configured: Wed, 24 Jun 2020 11:02:40 GMT
> configuration file: /usr/local/etc/namedb/named-custom.conf
> CPUs found: 56
> worker threads: 56
> UDP listeners per interface: 20
> number of zones: 144218 (0 automatic)
> debug level: 0
> xfers running: 0
> xfers deferred: 0
> soa queries in progress: 0
> query logging is ON
> recursive clients: 0/900/1000
> tcp clients: 6/1000
> TCP high-water: 40
> server is up and running
> 
> We have only about 70000 zones, but BIND seems to count them twice when
> they are signed.
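
A quick check of that reading, assuming each signed zone is counted twice and each unsigned zone once, so that the reported total equals configured zones plus signed zones. The 73000 figure is the approximate zone count given earlier in this message, so the result is only an estimate. The function name signed_zone_estimate is hypothetical.

```python
def signed_zone_estimate(reported_total, configured_zones):
    """Estimate signed zones, assuming each signed zone is counted twice.

    reported_total = configured_zones + signed_zones
    """
    return reported_total - configured_zones


if __name__ == "__main__":
    # 144218 is the "number of zones" value from the rndc status output above.
    signed = signed_zone_estimate(144218, 73000)
    print(signed)                    # 71218
    print(round(signed / 73000, 3))  # 0.976
```

Under these assumptions roughly 71000 of the zones would be signed, which is consistent with "most of them signed".
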
> 
> I have rebuilt BIND with --with-tuning=large:
> 
> # /usr/local/sbin/named -V
> BIND 9.11.20 (Extended Support Version) <id:f3d1d66>
> running on FreeBSD amd64 12.1-RELEASE-p1 FreeBSD 12.1-RELEASE-p1 GENERIC
> built by make with '--localstatedir=/var' '--disable-linux-caps' '--with-randomdev=/dev/random' '--with-libxml2=/usr/local' '--with-readline=-L/usr/local/lib -ledit' '--with-dlopen=yes' '--with-gost=no' '--without-python' '--sysconfdir=/usr/local/etc/namedb' '--with-dlz-filesystem=yes' '--disable-dnstap' '--enable-filter-aaaa' '--disable-fixed-rrset' '--without-geoip2' '--without-gssapi' '--with-libidn2=/usr/local' '--enable-ipv6' '--with-libjson=/usr/local' '--disable-largefile' '--with-lmdb=/usr/local' '--disable-native-pkcs11' '--disable-querytrace' '--enable-rpz-nsdname' '--enable-rpz-nsip' 'STD_CDEFINES=-DDIG_SIGCHASE=1' '--with-openssl=/usr' '--enable-threads' '--with-tuning=large' '--disable-symtable' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/share/info/' '--build=amd64-portbld-freebsd12.1' 'build_alias=amd64-portbld-freebsd12.1' 'CC=cc' 'CFLAGS=-O2 -pipe -DLIBICONV_PLUG -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing ' 'LDFLAGS= -fstack-protector-strong ' 'LIBS=-L/usr/local/lib' 'CPPFLAGS=-DLIBICONV_PLUG -isystem /usr/local/include' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
> compiled by CLANG 4.2.1 Compatible FreeBSD Clang 8.0.1 (tags/RELEASE_801/final 366581)
> compiled with OpenSSL version: OpenSSL 1.1.1d-freebsd  10 Sep 2019
> linked to OpenSSL version: OpenSSL 1.1.1d-freebsd  10 Sep 2019
> compiled with libxml2 version: 2.9.10
> linked to libxml2 version: 20910
> compiled with libjson-c version: 0.13.1
> linked to libjson-c version: 0.13.1
> compiled with zlib version: 1.2.11
> linked to zlib version: 1.2.11
> threads support is enabled
> 
> default paths:
>   named configuration:  /usr/local/etc/namedb/named.conf
>   rndc configuration:   /usr/local/etc/namedb/rndc.conf
>   DNSSEC root key:      /usr/local/etc/namedb/bind.keys
>   nsupdate session key: /var/run/named/session.key
>   named PID file:       /var/run/named/pid
>   named lock file:      /var/run/named/named.lock
> 
> What more can I do to avoid this bug?
> 
> Is there a parameter or build option for such a "big" server?

-- 
Laurent Frigault | Free.org - BookMyName.com - ONLINE SAS - Registar ID 74


More information about the bind-users mailing list