DNSTAP output file rolling trouble in BIND 9.12.0rc1

Jay Ford jnford at uiowa.net
Tue Jan 2 20:00:52 UTC 2018


I'm having some odd trouble with DNSTAP output file rolling in BIND 
9.12.0rc1.

named is built as follows:
    BIND 9.12.0rc1 <id:f9c3aba>
    running on Linux x86_64 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-1 (2016-03-06)
    built by make with 'STD_CDEFINES=-DISC_FACILITY=LOG_LOCAL5' '--libdir=/usr/lib/x86_64-linux-gnu' '--with-openssl' '--enable-dnstap' '--enable-fixed-rrset' '--disable-openssl-version-check' '--with-libtool' '--enable-dnsrps'
    compiled by GCC 6.3.0 20170516
    compiled with OpenSSL version: OpenSSL 1.1.0f  25 May 2017
    linked to OpenSSL version: OpenSSL 1.1.0f  25 May 2017
    compiled with libxml2 version: 2.9.4
    linked to libxml2 version: 20904
    threads support is enabled

DNSTAP is configured as follows:
    dnstap {
       client query;
    };
    dnstap-output file "tmp/dnstap.out" versions 10 size 10m;

It mostly works as expected, except that named:
    o  logs twice about rolling the file every time, such as:
          Jan  2 05:15:42 named[24758]: dnstap: info: rolling dnstap
             destination 'tmp/dnstap.out'
          Jan  2 05:15:42 named[24758]: dnstap: info: rolling dnstap
             destination 'tmp/dnstap.out'
    o  sometimes crashes after logging that, possibly after rolling the file
    o  writes to multiple output files simultaneously, such as:
          ls -lt dnstap* | head -2
          -rw-r--r-- 1 bind bind  1282048 Jan  2 16:24 dnstap.out
          -rw-r--r-- 1 bind bind  1273856 Jan  2 16:24 dnstap.out.0
       & 2 minutes later:
          ls -lt dnstap* | head -2
          -rw-r--r-- 1 bind bind  1286144 Jan  2 16:26 dnstap.out
          -rw-r--r-- 1 bind bind  1277952 Jan  2 16:26 dnstap.out.0
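
The simultaneous-write symptom above can be checked without eyeballing ls output. The following is a minimal sketch in Python, not part of BIND; the helper names snapshot and growing_files are made up for illustration. It records file sizes twice and reports every file that grew in between. More than one growing file would match the behavior shown above.

```python
import glob
import os


def snapshot(pattern):
    """Map each file matching the glob pattern to its current size in bytes."""
    sizes = {}
    for path in glob.glob(pattern):
        try:
            sizes[path] = os.path.getsize(path)
        except OSError:
            # The file may have been renamed or removed by a roll
            # between the glob and the stat; skip it.
            pass
    return sizes


def growing_files(before, after):
    """Return the sorted paths present in both snapshots whose size increased."""
    return sorted(
        path for path in after
        if path in before and after[path] > before[path]
    )
```

To use it, take snapshot("tmp/dnstap.out*") (path relative to named's working directory, as in the config above), wait a minute or two, take a second snapshot, and pass both to growing_files. A healthy setup should report at most the single active output file.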

This system had 4 worker threads in use.  Another similar system with only 1 
thread does not have such trouble, which got me wondering about problems with 
threads & DNSTAP, specifically output file rolling.  Reducing the threads on 
the afflicted system (via named option "-n 1") seems to avoid the problem, 
but it's a little early to tell, & it's not a desirable fix.
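
To make the threads suspicion concrete, here is a sketch of the kind of check-then-roll race that would produce two "rolling" log lines per roll. This is purely illustrative Python and is not taken from BIND's source; the Output class, the worker functions, and the barrier used to force the unlucky interleaving are all assumptions made up for the example.

```python
import threading


class Output:
    """Toy stand-in for a size-limited output file (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit
        self.size = limit          # start exactly at the roll threshold
        self.log = []
        self.lock = threading.Lock()

    def roll(self):
        self.log.append("rolling dnstap destination")
        self.size = 0


def unsafe_worker(out, barrier):
    needs_roll = out.size >= out.limit   # check ...
    barrier.wait()                       # every worker finishes checking first
    if needs_roll:                       # ... then act on a stale answer
        out.roll()


def safe_worker(out, barrier):
    barrier.wait()
    with out.lock:
        if out.size >= out.limit:        # re-check while holding the lock
            out.roll()


def run(worker, n=2):
    """Run n workers against a fresh Output and return its log lines."""
    out = Output(limit=10)
    barrier = threading.Barrier(n)
    threads = [threading.Thread(target=worker, args=(out, barrier))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.log
```

With two workers, run(unsafe_worker) logs the roll twice, while run(safe_worker) logs it once. Whether named actually behaves like the unsafe variant is exactly the open question; the sketch only shows that the double log line is consistent with it.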

I'd appreciate it if somebody who knows the code would comment on the threads 
vs DNSTAP possibility or point me in some other direction to figure this out.

I have a named core file & can provide more config... details if required.

________________________________________________________________________
Jay Ford, Network Engineering Group, Information Technology Services
University of Iowa, Iowa City, IA 52242

