dns cache issue

Edwardo Garcia wdgarc88 at gmail.com
Sat Jan 12 02:48:22 UTC 2019


OK, so this happened again, with link congestion.

bind is caching the results: tested with no congestion, lookups go from 78ms
down to 1ms... BUT the issue with bind remains and the logs show nothing wrong.

Lookups on the congested link, tried in quick succession with a second or
less between them:
google.com (like any other host I try): timeout, no servers can be reached
lookup of the internal zone I added to bind: replies in 7ms
retry google and a few other sites again: all timeout, no servers can be
reached
(google may only have a 5min TTL, but the other domains I'm testing,
including our mail provider etc, have 1 day)
ping to the DNS box is quick
ping to other boxes is quick too
disconnect the PC doing windows updates, and google et al respond in 1ms, so
it obviously is in the bloody cache, but because bind can't do something
over the internet in a timely manner it just spits the dummy

Why does bind do this? If it already knows the answer, it should give the
answer, since it holds the record, just as it knows the internal test zone.

This all causes mail to fail, web browsing to fail, and the boss is not happy.
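
One way to separate "is the record actually in the cache" from "is recursion
stalling" is to send the local server a query with the RD (recursion desired)
bit cleared; a caching server such as BIND should answer that from its cache
without going upstream. Below is a minimal sketch of such a check in Python,
standard library only. The function names are made up for illustration, and
it assumes plain UDP to port 53 with no EDNS.

```python
import socket
import struct
import time


def build_query(name, qtype=1, recurse=False, qid=0x1234):
    """Build a minimal DNS query packet (class IN, default type A)."""
    flags = 0x0100 if recurse else 0x0000  # 0x0100 is the RD bit
    header = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(packet):
    """Return (rcode, answer_count) from the 12-byte DNS header."""
    _qid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return flags & 0x000F, ancount


def timed_lookup(server, name, recurse=False, timeout=2.0):
    """Send one UDP query.

    Returns (elapsed_ms, rcode, answer_count), or None if no reply
    arrives within `timeout` seconds.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        start = time.monotonic()
        sock.sendto(build_query(name, recurse=recurse), (server, 53))
        data, _addr = sock.recvfrom(4096)
        elapsed_ms = (time.monotonic() - start) * 1000.0
        return (elapsed_ms,) + parse_header(data)
    except socket.timeout:
        return None
    finally:
        sock.close()
```

For example, timed_lookup("192.0.2.53", "google.com") sends an RD=0 query
(192.0.2.53 is a placeholder address, not the real DNS box), and passing
recurse=True sends the normal recursive query. If the RD=0 query comes back
quickly with a non-zero answer count while the recursive one times out during
congestion, that would suggest the record is cached and the stall is
somewhere on the recursion path rather than in the cache itself.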



On Fri, Jan 11, 2019 at 9:27 AM Edwardo Garcia <wdgarc88 at gmail.com> wrote:

> Kevin,
> I thought of LAN saturation too, but I can ssh into the bind server
> immediately. I also did a lookup, from my other PC, on the local
> authoritative zone rpz.lan, and bind replies right away or within 1 second
> during congestion. Could dnssec be the problem? I did not disable that to
> test. It really is like it is not caching any external results, so maybe
> it needs to go out and do all the lookups again to make sure the signature
> is valid? I really don't know. I'm now guessing.
>
> I will try your suggestion of logging again, and as for link local, yes,
> a couple of years ago we saw problems
>
> ed
>
> On Fri, Jan 11, 2019 at 1:17 AM Kevin Darcy <kevin.darcy at fcagroup.com>
> wrote:
>
>> Offhand, sounds like your LAN is saturated so the queries might not be
>> getting to BIND in the first place. Or the replies aren't getting back.
>> It's unlikely that QoS is going to help this, you indicated that QoS was on
>> your "router", and that is typical -- usually QoS is found on WAN links.
>> (Although, on the other hand, you mentioned VoIP, and VoIP sometimes
>> requires applying QoS at the LAN level too).
>>
>> You currently have query logging turned off. If it's not too
>> resource-intensive, you might want to consider turning that on, to verify
>> whether the queries are getting to BIND. Or, run a packet capture on the
>> BIND side. Packet capture on the BIND device should also help to identify
>> any issues talking upstream (e.g. to TLD servers or auth servers for
>> domains like google.com). Packet capture on the *client* side would
>> probably be necessary for definitive proof of whether replies are being
>> dropped by the LAN (compare what the server sent side-by-side with what the
>> client saw).
>>
>> I was intrigued by "server fe80::/16 { bogus yes; }; " in your config.
>> Have you had issues with IPv6 link-local addresses being associated with
>> delegated nameservers? I haven't noticed this, but then again, I haven't
>> been looking for that particular misconfiguration specifically...
>>
>>
>>                           - Kevin
>>
>>
>>
>> On Thu, Jan 10, 2019 at 12:06 AM Edwardo Garcia <wdgarc88 at gmail.com>
>> wrote:
>>
>>> With the new windows update in the last day, we noticed something
>>> strange: our local DNS cache server times out on lookups.
>>>
>>> For example, look up google.com, and 1 minute later the same lookup
>>> fails with a timeout. Since it has already looked it up, it should have
>>> returned the answer from cache, yes? google has a 5min TTL, but my cache
>>> doesn't cache it for even 1ns, it seems.
>>>
>>> QoS on the router gives DNS (udp and tcp) and VoIP highest priority;
>>> everything else is default. QoS must be working, because if I do
>>> host www.google.com $externalDNSserver   I get an answer pretty much
>>> right away. If I immediately try again on our local dns server, it times
>>> out: can't connect to any servers.
>>> This continues on. If I drop the LAN port on the switch that the windows
>>> update machine uses, it resolves google.com again; bring that port back
>>> up, and it times out again.
>>>
>>> this only happens on congestion, with our cable link maxed out.
>>>
>>> (never thought i'd see the day when a windows pc would take out an
>>> entire network)
>>>
>>> Below is my named.conf. I have to be missing something?
>>>
>>> BIND 9.11.2-P1
>>> running on Linux i686 3.16.58 #1 SMP Sat Sep 29 11:06:24 AEST 2018
>>> built by make with defaults
>>>
>>> acl "trusted" { localhost; 192.168.100.0/24; };
>>> acl "sysop" { localhost; 192.168.100.6; };
>>>
>>> options {
>>>         directory "/var/named";
>>>         allow-query { trusted; };
>>>         allow-query-cache { trusted; };
>>>         allow-transfer { sysop; };
>>>         transfer-format many-answers;
>>>         masterfile-format text;
>>>         interface-interval 0;
>>>         response-policy {zone "rpz.lan"; };
>>>         dnssec-enable yes;
>>>         dnssec-validation auto;
>>>         empty-zones-enable yes;
>>> };
>>>
>>> server fe80::/16 { bogus yes; };
>>>
>>> logging {
>>>         category lame-servers { null; };
>>>         category edns-disabled { null; };
>>>         category client { null; };
>>>         category dnssec { null; };
>>>          //channel log_queries { file "/var/named/query.log"; print-category yes; };
>>>          //category queries { log_queries; };
>>>         channel log-rpz { file "/var/log/rpz.log" versions 10 25m;
>>> severity info; };
>>>         category rpz { log-rpz; };
>>> };
>>>
>>> zone "." {
>>>         type hint;
>>>         file "root.cache";
>>> };
>>>
>>> zone "rpz.lan" {
>>>         type master;
>>>         file "rpz.lan";
>>>         allow-query { trusted; };
>>>         allow-update {none;};
>>>         notify no;
>>> };
>>>
>>>
>>> zone "akamai.net" {
>>>         type forward;
>>>         forward first;
>>>         forwarders { xxxxxx; xxxxxx; };
>>> };
>>>
>>>
>>>
>>> bind-users mailing list
>>> bind-users at lists.isc.org
>>> https://lists.isc.org/mailman/listinfo/bind-users
>>>
>>