"*.dlv.isc.org DS: must be secure" warnings [was: Re: 9.6.1-P1 log message]
Mark Andrews
marka at isc.org
Mon Sep 28 00:16:51 UTC 2009
In message <Prayer.1.3.2.0909262248400.24454 at hermes-1.csi.cam.ac.uk>,
Chris Thompson writes:
> Back in August there was a thread on bind-users about messages
> of the shape
>
> validating @[hex]: [name].dlv.isc.org DS: must be secure failure
>
> (these are category "dnssec" severity "warning") and on 31 August I wrote:
>
> >We have been running two production recursive nameservers validating against
> >dlv.isc.org since 9 June, and first saw a batch of messages (for both servers)
> >like this on 20 July. We reported them to ISC and got suggestions along the
> >lines of Mark's above, along with an admission that current versions of BIND
> >give up on EDNS too easily in situations they maybe shouldn't, which may be
> >fixed in future releases.
> >
> >Since then we have had a trickle of such warning messages in the logs. We
> >assume that they are the result of temporary network glitches somewhere,
> >but their frequency appears to be increasing, which is somewhat worrying.
> >It's also not clear whether any client queries are actually failing as a
> >result, or whether BIND is simply trying another dlv.isc.org nameserver
> >with better luck.
>
> I have been looking at this again, and in fact there was a step function
> on 21 August when the messages rose from almost nil to 15-20 per day, and
> then fell back to almost nil after 15 September (we've seen just one since
> then). We have been running BIND 9.6.1-P1 throughout.
>
> I would be very interested to know whether other recursive nameserver
> operators validating via dlv.isc.org have seen a similar pattern. I am
> prepared to believe that the frequency is related to transient network
> errors or delays, but I have no idea whether they are likely to be local
> or at the dlv.isc.org server end.
One gets these or similar messages when named falls back to plain
DNS as a result of multiple timeouts. Named tries EDNS advertising
a 4096 byte UDP buffer, then after multiple timeouts it tries EDNS
advertising a 512 byte UDP buffer, then after multiple timeouts it
tries plain DNS.
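
The fallback sequence above can be sketched as a small lookup. This is an
illustrative model only, not named's actual code: the names FALLBACK_STEPS,
step_after_timeouts and can_validate are hypothetical, and the threshold of
3 timeouts per step is an assumption (the description says only "multiple
timeouts").

```python
# Illustrative model of the EDNS fallback ladder; not taken from BIND source.
FALLBACK_STEPS = [
    ("EDNS", 4096),   # EDNS advertising a 4096 byte UDP buffer
    ("EDNS", 512),    # EDNS advertising a 512 byte UDP buffer
    ("plain", None),  # plain DNS: no EDNS, hence no DO bit on the query
]

TIMEOUTS_PER_STEP = 3  # assumed value; the source says only "multiple"


def step_after_timeouts(timeouts):
    """Return the (mode, udp_size) in use after `timeouts` consecutive timeouts."""
    index = min(timeouts // TIMEOUTS_PER_STEP, len(FALLBACK_STEPS) - 1)
    return FALLBACK_STEPS[index]


def can_validate(step):
    """The DO bit lives in the EDNS OPT record, so plain DNS cannot request
    the signatures a validator needs."""
    mode, _ = step
    return mode == "EDNS"
```

Under these assumptions, step_after_timeouts(6) returns the plain DNS step,
for which can_validate() is false. That is the state in which a validator
would be expected to log a "must be secure" failure.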

Named also had a bug where it would take an EDNS fallback step when it
didn't need to (such as when retrying with TCP). This made DNSSEC
difficult behind middleware that was dropping fragments. The relevant
change entry is:

    2564. [bug] Only take EDNS fallback steps when processing timeouts.
          [RT #19405]

Some (perhaps not all) of the causes of timeouts are listed below. This
list is not specific to DLV.

(Apparent) non-responses to UDP queries can have many causes:

*+  Firewalls/middleware that block DNS responses > 512 bytes
*+  Firewalls/middleware that block fragments
*+  Lack of support for out-of-order responses in NAT
*+  Responses that require fragmentation but have DF set. Most of these
    will be 1481-1500 bytes in size (IP-in-IP tunnels). Larger responses
    are usually fragmented by the sending OS and don't have DF set.
    Smaller responses make it through a single layer of encapsulation.
*+# Bad nameserver software that fails to respond to EDNS requests
*+# Firewalls/proxies that block EDNS queries, or queries/responses with
    one or more of DO, CD or AD set
*   Congestion
*   Packet corruption
*   Responses that appear lost because of long round-trip times
    - load-balancing probes taking too long
    - multiple satellite links
    - significant congestion causing long delays

+ indicates broken software
# indicates fallback to plain DNS will be required
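
The 1481-1500 byte range given for DF-set responses follows from simple
arithmetic, sketched below. The sketch assumes a 1500 byte link MTU and a
20 byte outer IPv4 header (no options) per layer of IP-in-IP encapsulation.
The function name dropped_in_tunnel is hypothetical.

```python
LINK_MTU = 1500      # typical Ethernet MTU in bytes (assumption)
IPIP_OVERHEAD = 20   # outer IPv4 header without options, in bytes


def dropped_in_tunnel(packet_size, layers=1):
    """True if a DF-set packet fits the link as sent but not once encapsulated."""
    fits_outside = packet_size <= LINK_MTU
    fits_inside = packet_size + layers * IPIP_OVERHEAD <= LINK_MTU
    return fits_outside and not fits_inside


# Packet sizes that cannot cross a single layer of encapsulation.
affected = [size for size in range(1400, 1501) if dropped_in_tunnel(size)]
print(affected[0], affected[-1])  # 1481 1500
```

A 1480 byte packet plus the 20 byte outer header is exactly 1500 bytes and
gets through. Anything from 1481 bytes up needs fragmenting, which DF forbids.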

A handful of these messages a day would suggest packet corruption or
congestion as the likely cause.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka at isc.org