BIND Process failed during logrotate

White, Peter pwhite at penguinrandomhouse.com
Wed Mar 22 13:40:51 UTC 2023


I had the named process fail this past weekend on two secondaries running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13. It seems that logrotate.d is calling the following script at the time of the failure.

/var/named/data/named.run {
    missingok
    su named named
    create 0644 named named
    postrotate
        /usr/bin/systemctl reload named.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-chroot.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-sdb.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-sdb-chroot.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-pkcs11.service > /dev/null 2>&1 || true
    endscript
}

First of all, is this script part of the normal BIND distribution, or is it part of the RHEL 7 distribution? From what I can tell, it is called weekly.

Poring through the BIND logs for the cause of the failure, I came across this. Note the server.c:2948 error message and subsequent failure.
19-Mar-2023 03:46:01.908 received control channel command 'reload'
19-Mar-2023 03:46:01.908 loading configuration from '/etc/named.conf'
19-Mar-2023 03:46:01.909 reading built-in trust anchors from file '/etc/named.root.key'
19-Mar-2023 03:46:01.909 GeoIP Country (IPv4) (type 1) DB not available
19-Mar-2023 03:46:01.909 GeoIP Country (IPv6) (type 12) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 2) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 6) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 30) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 31) DB not available
19-Mar-2023 03:46:01.909 GeoIP Region (type 3) DB not available
19-Mar-2023 03:46:01.909 GeoIP Region (type 7) DB not available
19-Mar-2023 03:46:01.909 GeoIP ISP (type 4) DB not available
19-Mar-2023 03:46:01.909 GeoIP Org (type 5) DB not available
19-Mar-2023 03:46:01.909 GeoIP AS (type 9) DB not available
19-Mar-2023 03:46:01.909 GeoIP Domain (type 11) DB not available
19-Mar-2023 03:46:01.909 GeoIP NetSpeed (type 10) DB not available
19-Mar-2023 03:46:01.909 using default UDP/IPv4 port range: [1024, 65535]
19-Mar-2023 03:46:01.909 using default UDP/IPv6 port range: [1024, 65535]
19-Mar-2023 03:46:01.910 sizing zone task pool based on 2 zones
19-Mar-2023 03:46:01.911 ../../../bin/named/server.c:2498: fatal error:
19-Mar-2023 03:46:01.911 RUNTIME_CHECK(tresult == 0) failed
19-Mar-2023 03:46:01.911 exiting (due to fatal error in library)
Looking back a week earlier when the script last run, that server.c error was not there.

Any thoughts on what could have caused this on two secondaries? The primary reloaded around the same time without incident.

Thanks for your assistance.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.isc.org/pipermail/bind-users/attachments/20230322/6f4c363f/attachment-0001.htm>


More information about the bind-users mailing list