Frequent timeout
John W. Blue
john.blue at rrcic.com
Thu Sep 6 19:04:38 UTC 2018
Alex,
Have you uploaded this pcap with the SERVFAILs? I didn't have time to look at your first upload but can review this one.
John
-----Original Message-----
From: bind-users [mailto:bind-users-bounces at lists.isc.org] On Behalf Of Alex
Sent: Thursday, September 06, 2018 1:49 PM
To: carl at byington.org; bind-users at lists.isc.org
Subject: Re: Frequent timeout
Hi,
On Mon, Sep 3, 2018 at 12:45 PM Carl Byington <carl at byington.org> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On Sun, 2018-09-02 at 21:54 -0400, Alex wrote:
> > Do you have any other ideas on how I can isolate this problem?
>
> Run tcpdump on the external ethernet connection.
>
> tcpdump -s0 -vv -i %s -nn -w /tmp/outputfile udp dst port domain
I've captured some packets that I believe include the ones related to the SERVFAIL errors I've been receiving. Now I have to figure out how to go through them.
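A rough first pass over such a capture is possible with only the Python standard library. The sketch below assumes a classic pcap file (not pcapng), untagged Ethernet framing and IPv4; the function names are made up for illustration. Because the filter in the tcpdump command above matches only "udp dst port domain", the file will generally hold the queries sent to port 53 but not the replies, so the most this can show is which names were sent to which server more than once, a hint that the resolver retried after a missing or unusable answer.

```python
import struct
from collections import Counter


def read_pcap_frames(data):
    """Yield the captured frames from a classic pcap byte string."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # the global header is 24 bytes long
    while offset + 16 <= len(data):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _, _, caplen, _ = struct.unpack(endian + "IIII", data[offset:offset + 16])
        offset += 16
        yield data[offset:offset + caplen]
        offset += caplen


def parse_dns_query(frame):
    """Return (qname, destination IP) for an IPv4/UDP query to port 53, else None."""
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":
        return None  # too short, or not IPv4 over untagged Ethernet
    ip = frame[14:]
    if ip[9] != 17:
        return None  # not UDP
    header_len = (ip[0] & 0x0F) * 4
    udp = ip[header_len:]
    if len(udp) < 8 + 12 or struct.unpack("!H", udp[2:4])[0] != 53:
        return None  # truncated, or not addressed to port 53
    dns = udp[8:]
    if dns[2] & 0x80:
        return None  # QR bit set: a response, not a query
    labels, pos = [], 12  # the question starts after the 12-byte DNS header
    while pos < len(dns) and dns[pos] != 0:
        length = dns[pos]
        if length > 63:
            return None  # compression pointer, not expected in a question name
        labels.append(dns[pos + 1:pos + 1 + length].decode("ascii", "replace"))
        pos += 1 + length
    dst = ".".join(str(b) for b in ip[16:20])
    return ".".join(labels) + ".", dst


def count_queries(data):
    """Count how often each (qname, destination IP) pair appears in the capture."""
    counts = Counter()
    for frame in read_pcap_frames(data):
        parsed = parse_dns_query(frame)
        if parsed is not None:
            counts[parsed] += 1
    return counts
```

With the file written by the tcpdump command above, count_queries(open("/tmp/outputfile", "rb").read()).most_common(20) would list the twenty most repeated name and server pairs. A tool such as Wireshark gives a far richer view; this is only a quick tally.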
In the meantime, I've configured /etc/resolv.conf to send queries to a remote system of ours, and the errors have (mostly) stopped.
I also notice some traces take an abnormal amount of time. Ping times to google.com are less than 20ms, but this trace shows reaching the root servers takes 104ms:
# dig +trace +nodnssec google.com
; <<>> DiG 9.11.4-P1-RedHat-9.11.4-5.P1.fc28 <<>> +trace +nodnssec google.com
;; global options: +cmd
. 3451 IN NS g.root-servers.net.
. 3451 IN NS k.root-servers.net.
. 3451 IN NS j.root-servers.net.
. 3451 IN NS c.root-servers.net.
. 3451 IN NS i.root-servers.net.
. 3451 IN NS e.root-servers.net.
. 3451 IN NS m.root-servers.net.
. 3451 IN NS l.root-servers.net.
. 3451 IN NS a.root-servers.net.
. 3451 IN NS h.root-servers.net.
. 3451 IN NS b.root-servers.net.
. 3451 IN NS d.root-servers.net.
. 3451 IN NS f.root-servers.net.
;; Received 839 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.
;; Received 287 bytes from 192.33.14.30#53(b.gtld-servers.net) in 44 ms
;; expected opt record in response
google.com. 300 IN A 172.217.10.14
;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms
Running the same trace again showed 129ms.
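To compare runs, the per-hop timings can be pulled out of the ";; Received ..." lines that dig prints after each step of the trace. This is a minimal sketch in Python; the function name trace_hops is made up here, and the sample text is copied from the trace above. Each timing belongs to the single server dig happened to query at that level (m.root-servers.net in this run), so it says little about the other root servers.

```python
import re

# Matches e.g. ";; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms"
RECEIVED = re.compile(
    r"^;; Received (?P<bytes>\d+) bytes from "
    r"(?P<address>[^#\s]+)#(?P<port>\d+)\((?P<server>[^)]+)\) in (?P<ms>\d+) ms"
)


def trace_hops(dig_output):
    """Return one dict per ';; Received' line, in the order dig printed them."""
    hops = []
    for line in dig_output.splitlines():
        match = RECEIVED.match(line.strip())
        if match:
            hops.append({
                "server": match.group("server"),
                "address": match.group("address"),
                "bytes": int(match.group("bytes")),
                "ms": int(match.group("ms")),
            })
    return hops


SAMPLE = """\
;; Received 839 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms
;; Received 287 bytes from 192.33.14.30#53(b.gtld-servers.net) in 44 ms
;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms
"""

if __name__ == "__main__":
    for hop in trace_hops(SAMPLE):
        print(f'{hop["ms"]:>5} ms  {hop["server"]} ({hop["address"]})')
```

Collecting these over several runs would show whether the slow step is always the same server or varies with whichever one dig selects.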
I also located this warning:
06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50 127.0.0.1#60968 (cmail20.com.multi.surbl.org): recursive-clients soft limit exceeded (901/900/1000), aborting oldest query
I've increased recursive-clients to 2500 but the SERVFAIL errors continue.
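The three numbers in that warning can be extracted from the log to see how often, and for which names, the limit is hit. The sketch below assumes they mean current clients / soft limit / hard limit; that is an interpretation, not something the log line states, and the function name is hypothetical.

```python
import re

SOFT_LIMIT = re.compile(
    r"\((?P<qname>[^)]+)\): recursive-clients soft limit exceeded "
    r"\((?P<current>\d+)/(?P<soft>\d+)/(?P<hard>\d+)\)"
)


def soft_limit_events(log_text):
    """Return one dict per 'soft limit exceeded' warning found in the text."""
    # Collapse whitespace first so entries wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return [
        {
            "qname": m.group("qname"),
            "current": int(m.group("current")),  # assumed meaning, see above
            "soft": int(m.group("soft")),
            "hard": int(m.group("hard")),
        }
        for m in SOFT_LIMIT.finditer(flat)
    ]


SAMPLE = """\
06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50
127.0.0.1#60968 (cmail20.com.multi.surbl.org): recursive-clients soft limit exceeded (901/900/1000), aborting oldest query
"""

if __name__ == "__main__":
    for event in soft_limit_events(SAMPLE):
        print(event)
```

Counting these events per minute before and after raising recursive-clients would show whether the higher limit is still being reached.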
There are also a ton of lame-server entries, many of which are related to one RBL or another, as part of my postscreen config:
06-Sep-2018 13:16:50.686 lame-servers: info: connection refused resolving '48.167.85.209.zz.countries.nerd.dk/A/IN': 195.182.36.121#53
06-Sep-2018 13:16:50.706 lame-servers: info: connection refused resolving '48.167.85.209.bb.barracudacentral.org/A/IN': 64.235.154.72#53
06-Sep-2018 13:16:51.308 lame-servers: info: connection refused resolving '48.167.85.209.bl.blocklist.de/A/IN': 185.21.103.31#53
06-Sep-2018 13:16:54.798 lame-servers: info: connection refused resolving 'e51dd24f684d212a7da1119b23603b0f.generic.ixhash.net/A/IN': 178.254.39.16#53
06-Sep-2018 13:16:54.799 lame-servers: info: connection refused resolving 'f4d997d8949e6dbd30f6a418ad364589.generic.ixhash.net/A/IN': 178.254.39.16#53
06-Sep-2018 13:16:55.762 lame-servers: info: connection refused resolving '2.164.177.209.bb.barracudacentral.org/A/IN': 64.235.145.15#53
06-Sep-2018 13:16:55.845 lame-servers: info: connection refused resolving '2.164.177.209.bb.barracudacentral.org/A/IN': 64.235.154.72#53
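Grouping these entries by list zone and by server address shows quickly whether a handful of servers account for most of them. A minimal Python sketch follows; the function names are made up, and the way the list zone is derived (dropping leading all-digit labels and 32-character hex labels from the query name) is a heuristic assumed for this illustration, not anything BIND reports. The sample lines are taken from the log above.

```python
import re
from collections import Counter

LAME = re.compile(
    r"lame-servers: info: (?P<reason>.+?) resolving "
    r"'(?P<qname>[^/']+)/(?P<qtype>\w+)/(?P<qclass>\w+)':\s*"
    r"(?P<server>[0-9A-Fa-f.:]+)#(?P<port>\d+)"
)


def list_zone(qname):
    """Strip reversed-IPv4 octets or a 32-hex hash from the front (heuristic)."""
    labels = qname.split(".")
    while labels and (labels[0].isdigit()
                      or re.fullmatch(r"[0-9a-f]{32}", labels[0])):
        labels.pop(0)
    return ".".join(labels)


def tally_lame_servers(log_text):
    """Return (count per list zone, count per server address)."""
    by_zone, by_server = Counter(), Counter()
    # The \s* in the pattern lets an entry wrapped onto a second line still match.
    for m in LAME.finditer(log_text):
        by_zone[list_zone(m.group("qname"))] += 1
        by_server[m.group("server")] += 1
    return by_zone, by_server


SAMPLE = """\
06-Sep-2018 13:16:50.686 lame-servers: info: connection refused resolving '48.167.85.209.zz.countries.nerd.dk/A/IN': 195.182.36.121#53
06-Sep-2018 13:16:50.706 lame-servers: info: connection refused resolving '48.167.85.209.bb.barracudacentral.org/A/IN':
64.235.154.72#53
06-Sep-2018 13:16:54.798 lame-servers: info: connection refused resolving 'e51dd24f684d212a7da1119b23603b0f.generic.ixhash.net/A/IN':
178.254.39.16#53
06-Sep-2018 13:16:55.845 lame-servers: info: connection refused resolving '2.164.177.209.bb.barracudacentral.org/A/IN':
64.235.154.72#53
"""

if __name__ == "__main__":
    zones, servers = tally_lame_servers(SAMPLE)
    print(zones.most_common())
    print(servers.most_common())
```

Run against the full log, the per-server counts would show whether the refusals come from a few specific nameservers or are spread across every list in the postscreen config.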
What could cause such a significant delay in reaching the root servers?
Thanks,
Alex