retry limit exceeded / possible network problem?

Alex mysqlstudent at gmail.com
Wed Mar 23 19:43:45 UTC 2016


Hi,

I have a Fedora 23 system with bind-9.10.3 that's been running fine for
a long time. For some reason, queries started timing out this morning.
This is a mail server, so queries to Spamhaus, Barracuda, etc. started
timing out with:

Mar 23 14:46:57 mail03 postfix/postscreen[12635]: warning: dnsblog reply timeout 10s for mykey.zen.dq.spamhaus.net

where 'mykey' is the key assigned to me for the service. (This isn't a
"query volume reached" kind of error.)

It's almost as if a firewall were blocking outbound access, but
that's not the case. Sometimes queries work, sometimes they time out:

# host google.com
;; connection timed out; no servers could be reached

Trying the same command again might work. Here's an example with
messagelabs:

# host messagelabs.com
;; connection timed out; no servers could be reached
# host messagelabs.com
messagelabs.com has address 216.12.145.20
messagelabs.com has address 155.64.49.54
;; connection timed out; no servers could be reached
# host messagelabs.com 8.8.4.4
Using domain server:
Name: 8.8.4.4
Address: 8.8.4.4#53
Aliases:

messagelabs.com has address 216.12.145.20
messagelabs.com has address 155.64.49.54
messagelabs.com mail is handled by 10 cluster6.eu.messagelabs.com.
messagelabs.com mail is handled by 20 cluster6a.eu.messagelabs.com.

It does appear to work reliably when using Google's nameservers.
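
A rough way to put a number on that difference would be to repeat a
single-attempt lookup against each resolver and count the failures. The
sketch below assumes dig (from bind-utils) is installed and that the local
named answers on 127.0.0.1; the function name probe and the DIG override
variable are hypothetical, introduced only for illustration.

```shell
# Count how many of COUNT lookups get no reply from SERVER.
# Usage: probe SERVER NAME [COUNT]
# DIG is a hypothetical override so another binary can be substituted.
probe() {
    server=$1
    name=$2
    count=${3:-20}
    fail=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # +tries=1 +time=2: one attempt with a 2 second timeout, so a
        # single lost packet shows up as a failure instead of being retried.
        # dig exits non-zero when it gets no reply at all; a SERVFAIL
        # answer still counts as a reply here.
        if ! ${DIG:-dig} +tries=1 +time=2 +short "@$server" "$name" A >/dev/null 2>&1; then
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "$server: $fail/$count lookups failed"
}

# Example:
#   probe 127.0.0.1 messagelabs.com
#   probe 8.8.4.4 messagelabs.com
```

If the failure count is high only for the local address, that would point
at named's own upstream queries rather than at the link as a whole.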

Just running "dig" returns all the forward entries for the top-level
servers, but not the reverse. My hints file does have both, however.

Then I noticed these in the bind logs:

23-Mar-2016 15:12:10.603 general: info: zone sbl.example.com/IN: refresh: retry limit for master 64.11.16.5#53 exceeded (source 0.0.0.0#0)
23-Mar-2016 15:12:10.603 general: info: zone sbl.example.com/IN: Transfer started.
23-Mar-2016 15:12:10.615 xfer-in: info: transfer of 'sbl.example.com/IN' from 64.11.16.5#53: connected using 68.193.193.45#39699
23-Mar-2016 15:12:10.627 xfer-in: info: transfer of 'sbl.example.com/IN' from 64.11.16.5#53: Transfer status: up to date
23-Mar-2016 15:12:10.627 xfer-in: info: transfer of 'sbl.example.com/IN' from 64.11.16.5#53: Transfer completed: 0 messages, 1 records, 0 bytes, 0.012 secs (0 bytes/sec)

where 'example.com' is my domain. A little googling suggests this means
the SOA refresh query over UDP failed and named then fell back to TCP.
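
One way to check that reading would be to ask the master for the zone's
SOA record over UDP and then over TCP, and see which transport gets a
reply. This is only a sketch: it assumes dig is available and that the
master answers SOA queries from this host. The function name check_soa and
the DIG override variable are hypothetical.

```shell
# Query MASTER for ZONE's SOA once over UDP and once over TCP.
# Usage: check_soa MASTER ZONE
# DIG is a hypothetical override so another binary can be substituted.
check_soa() {
    master=$1
    zone=$2
    for mode in udp tcp; do
        case $mode in
            udp) flag=+notcp ;;   # force plain UDP
            tcp) flag=+tcp ;;     # force TCP
        esac
        # One attempt, 3 second timeout; dig exits non-zero on no reply.
        if ${DIG:-dig} "$flag" +tries=1 +time=3 +short "@$master" "$zone" SOA >/dev/null 2>&1; then
            echo "$mode: reply received"
        else
            echo "$mode: no reply"
        fi
    done
}

# Example:
#   check_soa 64.11.16.5 sbl.example.com
```

A result of "udp: no reply" together with "tcp: reply received" would be
consistent with the log lines above.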

This system is running on a Cablevision/Optonline business-class cable
connection. They've said the circuit is operating normally. Could this
still be some kind of network issue? There are no local errors on the
interface, and I've rebooted their modem and even replaced the network
cable.

Perhaps you know of a tcpdump option where I can look for network
retries or some type of packet retransmission/errors?
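
A plain capture of DNS traffic could serve as a starting point here. As
far as I know tcpdump does not flag retransmissions by itself, but queries
that are repeated without a matching answer should be visible when the
capture is read back. The sketch below assumes the interface name is
passed in by the caller; the function name capture_dns, the default
output file dns.pcap, and the TCPDUMP override variable are hypothetical.

```shell
# Capture DNS traffic on one interface into a pcap file.
# Usage: capture_dns IFACE [OUTFILE]
# TCPDUMP is a hypothetical override so another binary can be substituted.
capture_dns() {
    iface=$1
    out=${2:-dns.pcap}
    # -n  do not resolve addresses, so the capture itself causes no lookups
    # -i  interface to listen on
    # -w  write raw packets to a file for later inspection
    # "port 53" matches DNS over both UDP and TCP
    ${TCPDUMP:-tcpdump} -n -i "$iface" -w "$out" port 53
}

# Example (run as root, stop with Ctrl-C, then read the file back):
#   capture_dns eth0 /tmp/dns.pcap
#   tcpdump -n -r /tmp/dns.pcap
```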

I'm really stuck, and the mail server isn't functioning while I figure
this out, so any help is greatly appreciated.

I've included my named.conf below; it was working fine yesterday:

acl "trusted" {
        { 127/8; };
        { 192.168.1.0/24; };
        { 23.224.183.206; };
        { 68.193.193.45; };
};
options {
        listen-on port 53 { 127.0.0.1; 68.193.193.45; };
        // listen-on-v6 port 53 { ::1; };
        listen-on-v6 port 53 { none; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named.stats";         // _PATH_STATS
        memstatistics-file "/var/named/data/named.memstats";   // _PATH_MEMSTATS
        allow-query     { trusted; };
        notify master-only;
        recursive-clients 5000;
        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable
           recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need
           to enable recursion.
         - If your recursive DNS server has a public IP address, you MUST
           enable access control to limit queries to your legitimate users.
           Failing to do so will cause your server to become part of large
           scale DNS amplification attacks. Implementing BCP38 within your
           network would greatly reduce such attack surface
        */
        // recursion yes;
        allow-recursion { trusted; };
        dnssec-enable yes;
        dnssec-validation yes;
        dnssec-lookaside auto;
        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";
        managed-keys-directory "/var/named/dynamic";
        pid-file "/run/named/named.pid";
        session-keyfile "/run/named/session.key";
};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
        // Record all queries to the box for now
        channel query_info {
           severity info;
           file "/var/log/named.query.log" versions 3 size 10m;
           print-time yes;
           print-category yes;
         };
        // added for fail2ban support
        channel security_file {
           severity dynamic;
           file "/var/log/named.security.log" versions 3 size 30m;
           print-time yes;
           print-category yes;
        };
        channel b_debug {
                file "/var/log/named.debug.log" versions 2 size 10m;
                print-time yes;
                print-category yes;
                print-severity yes;
                severity dynamic;
        };
        // Send the security related messages to a separate file.
        channel audit_log {
                file "/var/log/named.audit.log" versions 4 size 10m;
                severity info;
                print-time yes;
                print-category yes;
        };
        category queries { query_info; };
        category default { b_debug; };
        category config { b_debug; };
        category security { security_file; };
        category lame-servers { null; };
};
zone "." IN {
        type hint;
        file "/var/named/named.ca";
};
zone "sbl.example.com" {
        type slave;
        file "slaves/db.sbl.example.com";
        masters { 64.11.16.5; };
        allow-query { trusted; };
        allow-transfer { trusted; };
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
include "/etc/rndc.key";

Thanks,
Alex


More information about the bind-users mailing list