SERVFAIL and peak utilization

Alex mysqlstudent at gmail.com
Fri Jul 27 20:46:38 UTC 2018


Hi, I'm still having this problem and haven't received any replies.
Does anyone have any ideas on how to troubleshoot it?

What other information can I provide to help troubleshoot this?
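
One way to see whether timeouts dominate is to total the bracketed
counters from the "fetch completed" lines quoted below. This is only a
rough sketch in Python: the name tally_fetch_counters and the SAMPLE
string are made up for illustration, the pattern assumes each counter
list stays on one line as it does in the excerpt below, and SAMPLE
reproduces just two of the logged counter lists.

```python
import re
from collections import Counter, defaultdict

# Matches the bracketed counter list that ends a "fetch completed" line,
# e.g. [domain:psky.me,referral:1,restart:2,qrysent:4,timeout:3,...]
FETCH_RE = re.compile(r"\[domain:([^,\]]+),([^\]]*)\]")

# Two counter lists copied from the debug output in this thread.
SAMPLE = """
[domain:psky.me,referral:1,restart:2,qrysent:4,timeout:3,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
[domain:psbl.surriel.com,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
"""


def tally_fetch_counters(log_text):
    """Sum the per-fetch counters (qrysent, timeout, ...) for each domain."""
    totals = defaultdict(Counter)
    for match in FETCH_RE.finditer(log_text):
        domain, counters = match.group(1), match.group(2)
        for pair in counters.split(","):
            name, _, value = pair.partition(":")
            # Skip anything that is not a plain name:number pair.
            if value.isdigit():
                totals[domain][name] += int(value)
    return totals


totals = tally_fetch_counters(SAMPLE)
for domain, counts in sorted(totals.items()):
    print(domain, "qrysent:", counts["qrysent"], "timeout:", counts["timeout"])
```

To run it against a real log, pass the file's contents instead of
SAMPLE. With the logging configuration quoted below, the query-errors
messages would presumably land in /var/log/named.debug.log through the
default category, but that path is an assumption. Timeouts spread over
many unrelated domains would point toward the local link or resolver
limits; timeouts confined to a few domains would point toward those
domains' servers.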



On Thu, Jul 26, 2018 at 5:49 PM, Alex <mysqlstudent at gmail.com> wrote:
> Hi, here is some further debugging on what I believe are queries
> involving SERVFAIL:
>
> 26-Jul-2018 17:44:40.168 query-errors: debug 1: client @0x7fbee80f39b0
> 127.0.0.1#61547 (69.248.70.96.bad.psky.me): query failed (SERVFAIL)
> for 69.248.70.96.bad.psky.me/IN/A at ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.168 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for 69.248.70.96.bad.psky.me/A in
> 10.000096: timed out/success
> [domain:psky.me,referral:1,restart:2,qrysent:4,timeout:3,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 26-Jul-2018 17:44:40.172 query-errors: debug 1: client @0x7fbed81218a0
> 127.0.0.1#61547 (176.216.85.209.psbl.surriel.com): query failed
> (SERVFAIL) for 176.216.85.209.psbl.surriel.com/IN/A at
> ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.172 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for 176.216.85.209.psbl.surriel.com/A
> in 10.000128: timed out/success
> [domain:psbl.surriel.com,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 26-Jul-2018 17:44:40.173 query-errors: debug 1: client @0x7fbedc134ed0
> 127.0.0.1#61547 (176.216.85.209.dnsbl-3.uceprotect.net): query failed
> (SERVFAIL) for 176.216.85.209.dnsbl-3.uceprotect.net/IN/A at
> ../../../bin/named/query.c:8580
> 26-Jul-2018 17:44:40.173 query-errors: debug 2: fetch completed at
> ../../../lib/dns/resolver.c:3927 for
> 176.216.85.209.dnsbl-3.uceprotect.net/A in 10.000097: timed
> out/success [domain:dnsbl-3.uceprotect.net,referral:2,restart:1,qrysent:2,timeout:1,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
>
> There appear to be a few timeout errors. Is this an indication that
> there is a performance problem with the cable modem or connection?
>
> Thanks,
> Alex
>
>
> On Thu, Jul 26, 2018 at 1:57 PM, John Miller <johnmill at brandeis.edu> wrote:
>> Hi Alex,
>>
>> What does your query volume look like on this server?  Depending on
>> volume, the BIND defaults for:
>>
>> - clients-per-query
>> - max-clients-per-query
>> - recursive-clients
>> - tcp-clients
>>
>> and others may not be set high enough.  Check pp. 106-108 in the
>> latest 9.11 manual for more details on each of these.
>>
>> Of course, if you're only seeing SERVFAIL for a handful of domains,
>> then they may have some sort of delegation issue, or there might be a
>> network issue between your caching servers and them.
>>
>> John
>>
>>
>> On Thu, Jul 26, 2018 at 1:07 PM, Alex <mysqlstudent at gmail.com> wrote:
>>> Hi,
>>>
>>> I have a bind-9.11.4 server on a Fedora 28 system and am frequently
>>> seeing SERVFAIL errors like this:
>>>
>>> 26-Jul-2018 12:54:04.255 query-errors: info: client @0x7f764314a5c0
>>> 127.0.0.1#50719 (223.178.102.199.cidr.bl.mcafee.com): query failed
>>> (SERVFAIL) for 223.178.102.199.cidr.bl.mcafee.com/IN/A at
>>> ../../../bin/named/query.c:4140
>>>
>>> I believe this happens more frequently at times of peak link
>>> utilization, but it also appears to happen during normal times.
>>>
>>> This is a local caching server I've set up, but the problem also
>>> appears on other systems that have been set up to be authoritative
>>> for our domain.
>>>
>>> How can I troubleshoot this further?
>>>
>>> Here is the named.conf for this caching server:
>>>
>>> acl "trusted" {
>>>         { 127/8; };
>>>         { 68.195.191.40/29; };
>>>         { 192.168.1.0/24; };
>>>         { 107.155.67.2/32; };
>>> };
>>>
>>> options {
>>>         listen-on port 53 { 127.0.0.1; 68.195.191.45; };
>>>         listen-on-v6 port 53 { none; };
>>>         directory "/var/named";
>>>         dump-file "/var/named/data/cache_dump.db";
>>>         statistics-file "/var/named/data/named.stats";         // _PATH_STATS
>>>         memstatistics-file "/var/named/data/named.memstats";   // _PATH_MEMSTATS
>>>         allow-query     { trusted; };
>>>         recursion yes;
>>>         zone-statistics yes;
>>>
>>>         // dnssec-enable yes;
>>>         // dnssec-validation yes;
>>>         // dnssec-lookaside auto;
>>>
>>>         dnssec-enable no;
>>>         dnssec-validation no;
>>>         dnssec-lookaside no;
>>>
>>>         /* Path to ISC DLV key */
>>>         bindkeys-file "/etc/named.iscdlv.key";
>>>
>>>         managed-keys-directory "/var/named/dynamic";
>>>
>>> };
>>>
>>> logging {
>>>         channel default_debug {
>>>                 file "data/named.run";
>>>                 severity dynamic;
>>>         };
>>>
>>>         // Record all queries to the box for now
>>>         channel query_info {
>>>            severity info;
>>>            file "/var/log/named.query.log" versions 3 size 10m;
>>>            print-time yes;
>>>            print-category yes;
>>>          };
>>>
>>>         // added for fail2ban support
>>>         channel security_file {
>>>            severity dynamic;
>>>            file "/var/log/named.security.log" versions 3 size 30m;
>>>            print-time yes;
>>>            print-category yes;
>>>         };
>>>
>>>         channel b_debug {
>>>            file "/var/log/named.debug.log" versions 2 size 10m;
>>>            print-time yes;
>>>            print-category yes;
>>>            print-severity yes;
>>>            severity dynamic;
>>>         };
>>>
>>>         // Send the security related messages to a separate file.
>>>         channel audit_log {
>>>            file "/var/log/named.audit.log" versions 4 size 10m;
>>>            severity info;
>>>            print-time yes;
>>>            print-category yes;
>>>         };
>>>
>>>
>>>         category queries { query_info; };
>>>         category default { b_debug; };
>>>         category config { b_debug; };
>>>         category security { security_file; };
>>>         // category lame-servers { audit_log; };
>>>         category lame-servers { null; };
>>>
>>> };
>>>
>>> zone "." IN {
>>> type hint;
>>> file "/var/named/named.ca";
>>> };
>>>
>>> zone "localhost.localdomain" IN {
>>> type master;
>>> file "named.localhost";
>>> allow-update { none; };
>>> };
>>>
>>> zone "localhost" IN {
>>> type master;
>>> file "named.localhost";
>>> allow-update { none; };
>>> };
>>>
>>> zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa"
>>> IN {
>>> type master;
>>> file "named.loopback";
>>> allow-update { none; };
>>> };
>>>
>>> zone "1.0.0.127.in-addr.arpa" IN {
>>> type master;
>>> file "named.loopback";
>>> allow-update { none; };
>>> };
>>>
>>> zone "0.in-addr.arpa" IN {
>>> type master;
>>> file "named.empty";
>>> allow-update { none; };
>>> };
>>>
>>> include "/etc/named.root.key";
>>> include "/etc/rndc.key";