[SOLVED] Re: BIND9 SERVFAIL on some .gov addresses

Ryan Novosielski novosirj at umdnj.edu
Fri Feb 11 23:50:16 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/11/2011 01:21 PM, Ryan Novosielski wrote:
> On 02/10/2011 04:19 PM, Chuck Swiger wrote:
>> On Feb 10, 2011, at 12:39 PM, Ryan Novosielski wrote:
>>> health.nyc.gov query-errors:
>>>
>>> 10-Feb-2011 15:32:30.682 query-errors: debug 1: client
>>> 130.219.34.129#55935: query failed (SERVFAIL) for health.nyc.gov/IN/MX
>>> at query.c:4630
>>> 10-Feb-2011 15:32:30.682 query-errors: debug 2: fetch completed at
>>> resolver.c:3057 for health.nyc.gov/MX in 0.000046: failure/success
>>> [domain:nyc.GOV,referral:0,restart:1,qrysent:0,timeout:0,lame:0,neterr:0,badresp:0,adberr:4,findfail:0,valfail:0
> 
>> The adberr count looks like it can only be incremented by two code sections in lib/dns/resolver.c:
> 
>>         if (result != ISC_R_SUCCESS) {
>>                 if (result == DNS_R_ALIAS) {
>>                         /*
>>                          * XXXRTH  Follow the CNAME/DNAME chain?
>>                          */
>>                         dns_adb_destroyfind(&find);
>>                         fctx->adberr++;
>>                 }
>>         }
> 
>> [ ...and... ]
> 
>>                         if ((find->options & DNS_ADBFIND_LAMEPRUNED) != 0)
>>                                 fctx->lamecount++; /* cached lame server */
>>                         else
>>                                 fctx->adberr++; /* unreachable server, etc. */
> 
>> This implies a connectivity issue between your client and the nyc.gov nameservers, I think.
>> But there are local wizards lurking who are much more familiar with the code than I....
> 
> It is starting to appear as if this is an issue relating to EDNS, though
> I can't see specifically how. It does not appear to even be a size
> related issue, but instead possibly something to do with packet
> fragmentation. I built a BIND 9.6.2 server on a CentOS VM -- works fine
> off our network (connected via Verizon Wireless), but does not work on
> campus.
> 
> What I don't quite understand is why querying say 8.8.8.8 with a copy of
> dig on our network would work. Isn't the same thing ultimately going to
> have to pass through the same place in our firewall/network eventually
> whether it's a nameserver asking for it or a client?

So it was a two-part problem: one part pertains to BIND and the other
to the firewall.

1) I had max-udp-size set to 512 in named.conf, which I understood to
be the prudent setting if your firewall has a DNS packet limit of 512
bytes. For whatever reason, that turned out not to be correct.

2) In the firewall we had a packet size limit of 512 bytes for non-EDNS
traffic and "client auto" for EDNS traffic. However, in our version of
the firewall firmware the "client auto" setting does not work (a bug),
so all of our DNS traffic was effectively limited to 512 bytes.

What I haven't yet figured out is why #1 would cause the connectivity
problem that it did with the .gov DNS servers. It appears that perhaps
something was destroying the fragmented packets. I'd be curious whether
there's someone out there who knows more than I do and could help explain.
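
One way to narrow this down would be to send the same query with and
without an EDNS0 OPT record and compare what comes back. The following
is only a minimal sketch in Python (standard library only), not anything
taken from BIND: build_query and probe are hypothetical helper names,
and the 4096-byte size and DO bit are assumptions about what a resolver
might advertise.

```python
import socket
import struct


def build_query(name, qtype=15, edns_size=None, dnssec_ok=False, txid=0x1234):
    """Build a DNS query packet. qtype 15 is MX; edns_size=None omits the OPT record."""
    arcount = 0 if edns_size is None else 1
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, arcount)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    if edns_size is None:
        return header + question
    # EDNS0 OPT pseudo-RR: root name, TYPE=41, CLASS=advertised UDP size,
    # TTL field = extended RCODE, version, flags (DO bit is 0x8000), RDLEN=0
    flags = 0x8000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHBBHH", 41, edns_size, 0, 0, flags, 0)
    return header + question + opt


def probe(server, name, **kwargs):
    """Send one UDP query; return the response length in bytes, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(3.0)
        sock.sendto(build_query(name, **kwargs), (server, 53))
        try:
            data, _ = sock.recvfrom(65535)
        except socket.timeout:
            return None
        return len(data)
```

Calling probe() against one of the nyc.gov nameservers once with no
extra arguments and once with edns_size=4096, dnssec_ok=True should show
whether only the EDNS variant goes unanswered. If it does, that would
suggest large or fragmented responses are being dropped somewhere on the
path, though it would not say where.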

- -- 
- ---- _  _ _  _ ___  _  _  _
|Y#| |  | |\/| |  \ |\ |  | |Ryan Novosielski - Sr. Systems Programmer
|$&| |__| |  | |__/ | \| _| |novosirj at umdnj.edu - 973/972.0922 (2-0922)
\__/ Univ. of Med. and Dent.|IST/CST-Academic Svcs. - ADMC 450, Newark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1VyzgACgkQmb+gadEcsb4jDQCfUM3JoQNNg8kluYVaM7n4o/l0
W6MAoMzkyoKjJZntBUlvO0iLkjPkfq0l
=/R/g
-----END PGP SIGNATURE-----


More information about the bind-users mailing list