Trying to get full domain info in nslookup

Kevin Darcy kcd at daimlerchrysler.com
Wed Sep 28 02:17:49 UTC 2005


Mark Andrews wrote:

>>Mark Andrews wrote:
>>
>>    
>>
>>>>QTYPE=* (otherwise known as "any") queries are treated by BIND as 
>>>>non-recursive-when-something-is-cached-for-the-name-recursive-otherwise 
>>>>because of a misreading of RFC 1034 that has never been corrected.
>>>>   
>>>>
>>>>        
>>>>
>>>	In your opinion.  Please re-read Section 6.2.2.   It clearly
>>>	shows the caching servers returning subsets of records.
>>>
>>>      
>>>
>>In response to a *non-recursive* query, sure. All of the example queries 
>>in Section 6.2.2 are RD=0 unless otherwise noted (see the intro 
>>paragraph at 6.2). Nowhere in 1034/1035 is it permitted to treat an RD=1 
>>query as RD=0 and yet return the response as RA=1, which is what BIND 
>>does.  That's just fibbing. Of course, BIND or any DNS implementation for 
>>that matter, can decline to recurse a query but a) this decision should 
>>IMO be policy-driven, not hardcoded for QTYPE=* queries, and b) the 
>>responding server shouldn't *lie* about whether it is honoring recursion 
>>or not. Don't you think it kind of defeats the whole purpose of the RA 
>>bit if responders can set it any way they want, for any arbitrary reason?
>>    
>>
>
>	The nameserver is not required to recurse if it can answer
>	from the cache.  "*" is clearly NOT required to return all
>	possible records.
>
>3.7.1. Standard queries
>
>*               matches all RR types.
>
>	"*" matches NOT returns all RR types.  When you have an incomplete
>	cache "*" matches all records there.
>
That's highly arguable. * "matches" all RR types, but what does it 
"return"? You say, "all records there" (i.e. in the cache), I say "all 
records available", which could, in addition to cached records, mean 
records obtainable via recursion, if recursion is requested and honored. 
This section of the RFC is not clear on this point; it only talks about 
"matching", it doesn't say anything about "returning". In contrast, in 
the section describing recursive service generally (4.3.1), it says

>If recursive service is requested and available, the recursive response
>to a query will be one of the following:
>
>   - The answer to the query, possibly preface by one or more CNAME
>     RRs that specify aliases encountered on the way to an answer.
>
>   - A name error indicating that the name does not exist.  This
>     may include CNAME RRs that indicate that the original query
>     name was an alias for a name which does not exist.
>
>   - A temporary error indication.
>
Note: "the answer to the query". Not "as much of the answer as I feel 
like returning from my cache, if and only if I have a partial answer in 
my cache, otherwise the answer as I obtained it via recursion". Can you 
in good conscience say that when recursion is in effect and a resolver 
returns a *partial* answer to a query, even though the full answer is 
available via recursion, and even though the full answer *would* have 
been returned if the cache contents were slightly different, that it has 
really fulfilled its duty and returned "the answer to the query"? I 
would certainly dispute that interpretation. The obvious intent is that 
when recursion is in effect, the resolver attempts, within _reasonable_ 
limits, to get a proper answer back to the client. Those _reasonable_ 
limits should be both reasonably low (can't go on retrying and 
retransmitting forever), but they should also be reasonably high (which 
may mean recursing sometimes, even if we don't feel like doing so). I 
don't see returning "whatever I happen to have available without 
recursing" as sufficiently diligent. If that's what the client wanted 
and/or expected, it would have made a non-recursive query. What you're 
basically saying is that the client wasted its time setting the RD bit, 
and I can't accept that.
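
To make the RD/RA point concrete, here is a minimal Python sketch that decodes the flags word of a DNS header, using the bit layout from RFC 1035 section 4.1.1. The function name and the sample flags value are assumptions chosen for illustration; nothing here is taken from BIND.

```python
# Bit masks for the 16-bit flags word of a DNS header (RFC 1035, 4.1.1).
QR = 0x8000  # message is a response
AA = 0x0400  # authoritative answer
TC = 0x0200  # truncated
RD = 0x0100  # recursion desired (set by the client, copied into the response)
RA = 0x0080  # recursion available (set by the server)


def describe_flags(flags: int) -> dict:
    """Return the header bits relevant to the recursion discussion."""
    return {
        "qr": bool(flags & QR),
        "aa": bool(flags & AA),
        "tc": bool(flags & TC),
        "rd": bool(flags & RD),
        "ra": bool(flags & RA),
        "rcode": flags & 0x000F,
    }


# Hypothetical response flags: QR=1, RD=1, RA=1, RCODE=0.
sample = describe_flags(0x8180)
```

A response with RD=1 and RA=1 looks the same to the client whether or not the server actually recursed, which is why the bits only mean something if the server sets them honestly.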

>
>5.3.3. Algorithm
>
>The top level algorithm has four steps:
>
>   1. See if the answer is in local information, and if so return
>      it to the client.
>
>	....
>
>Step 1 searches the cache for the desired data. If the data is in the
>cache, it is assumed to be good enough for normal use.  
>
Again, the pertinent inquiry is: what is "the answer" to a QTYPE=* 
query? To be sure, BIND's current behavior gives *an* answer to the 
query, which violates no fundamental rules of DNS (such a violation of 
fundamental rules would be, e.g. if it answered a question that wasn't 
asked, or something like that), but is that type of answer *the* answer 
to the query, when it's only a partial answer, and the full answer is 
available, and recursion is in effect? I say no.
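
The difference between "an answer" and "the answer" can be shown with a toy model. The sketch below is only an approximation of the behaviour described in this thread, not BIND's code: the class name, the zone contents and the single-name cache are all assumptions, and TTLs are ignored entirely.

```python
# Hypothetical zone data held by the authoritative servers for one name.
AUTHORITATIVE = {
    "A": ["192.0.2.1"],
    "MX": ["10 mail.example.org."],
    "TXT": ["v=spf1 -all"],
}


class ToyResolver:
    """Toy model of the QTYPE=* behaviour described above; not BIND's code."""

    def __init__(self):
        self.cache = {}  # rrtype -> rdata list, for a single name

    def _recurse(self, rrtype):
        # Stand-in for real recursion: fetch from the "authoritative" dict.
        if rrtype == "*":
            fetched = dict(AUTHORITATIVE)
        elif rrtype in AUTHORITATIVE:
            fetched = {rrtype: AUTHORITATIVE[rrtype]}
        else:
            fetched = {}
        self.cache.update(fetched)
        return fetched

    def query(self, rrtype):
        if rrtype == "*":
            # Anything cached for the name is returned as-is; recursion
            # happens only when the cache holds nothing for the name.
            return dict(self.cache) if self.cache else self._recurse("*")
        if rrtype in self.cache:
            return {rrtype: self.cache[rrtype]}
        return self._recurse(rrtype)


cold = ToyResolver()
full = cold.query("*")      # empty cache: recursion, all three RRsets

warm = ToyResolver()
warm.query("MX")            # an earlier type-specific query populates the cache
partial = warm.query("*")   # only the MX RRset comes back
```

The same client query gets two different responses depending only on what happened to be cached beforehand.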

>>>	One could argue that named shouldn't even recurse in an
>>>	attempt to get some sort of an answer but then you would not
>>>	be able to determine if NXDOMAIN should be returned or not.
>>>
>>> 
>>>
>>>      
>>>
>>>>So if 
>>>>something happens to be cached for the name you're querying in the 
>>>>nameserver which is responding to the query, you get the cached data, 
>>>>otherwise it goes out and fetches a new set of Resource Records. To run 
>>>>a proper test, you'd need to clear the cache (i.e. restart the 
>>>>nameserver process) between each set of queries. Then you'd see that 
>>>>each response to a QTYPE=* query consists of only those RRs with the 
>>>>name immunetolerance.org that were cached from the responses to the 
>>>>previous immunetolerance.org queries (assuming that the cache wasn't 
>>>>being populated with immunetolerance.org Resource Records by anything else).
>>    
>>
>>>>Bottom line, BIND has made QTYPE=* a lot less useful than it could be or 
>>>>was originally intended to be. I think there isn't a lot of incentive to 
>>>>fix this, though, because to fix it raises the possibility that apps 
>>>>could start using QTYPE=* inappropriately, thus causing wasted 
>>>>resources. That's a FUD argument, though, and should not IMO stand in 
>>>>the way of a proper implementation.
>>>>   
>>>>
>>>>        
>>>>
>>>	To turn "*" into ALL the cache the cache would have to make
>>>	a "*" query for every query it made or remember it had made
>>>	a "*" query and clear that state whenever it expired a RRset.
>>>
>>>      
>>>
>>The former option is probably preferable from a code-simplicity 
>>standpoint, the latter from a performance/resource-consumption 
>>standpoint. Note that if only *one* RRset has expired from the cache, it 
>>should only be necessary to recurse for a fresh version of just that one 
>>RRset, not necessarily the full nameset (of course, this is subject to 
>>local and/or adaptive optimization; it might make sense to do the 
>>QTYPE=* query anyway, because the additional overhead is relatively low 
>>compared to the benefits of "freshening" all of the other cached RRsets 
>>under the name).
>>
>>    
>>
>>>	You then also have to deal with the fact that "*" queries are 
>>>	more likely to exceed the various DNS buffer sizes causing
>>>	fallover to TCP initially and then truncated TCP responses
>>>	(authoritative servers) or failures from caches as they can't
>>>	get a non truncated response.
>>>
>>>      
>>>
>>1. We're dealing with the buffer-size/truncation/TCP-retry issue anyway 
>>for DNSSEC
>>2. If the app can make do with a non-deterministic subset of the data, 
>>it always has the option of issuing a non-recursive query (like the ones 
>>exemplified in Section 6.2.2). If it absolutely *must* have a full set 
>>of data, and can't get it because it's too big, and doesn't have a 
>>fallback option of issuing multiple type-specific queries, then that's 
>>the fault of the app and/or the respective domain owner(s). We shouldn't 
>>be trying to save those folks from themselves. They'll eventually learn 
>>the error of their ways and either change the app and/or trim down the 
>>namesets to a reasonable size.
>>    
>>
>
>	I've yet to see an application (other than a zone/dns maintenance
>	application) that needs to see all the record types at a node.
>
Self-fulfilling prophecy. BIND -- and the other DNS implementations that 
have followed BIND's lead in this regard -- has made QTYPE=* so useless 
that most apps stay away from it. Doesn't mean that it wouldn't or 
couldn't be a useful and efficient (relative to multiple type-specific 
queries) querying mechanism.
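
As a rough illustration of the "one query versus several" comparison, this Python sketch encodes a single-question query in the RFC 1035 wire format, once with QTYPE=* (255) and once per type. The function name, the query ID and the example.org name are placeholders, and the encoder makes no attempt to validate label lengths.

```python
import struct

QTYPE_ANY = 255   # "*" in RFC 1035, section 3.2.3
QCLASS_IN = 1
FLAG_RD = 0x0100  # recursion desired


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query with RD set (RFC 1035, section 4.1)."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, FLAG_RD, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero octet.
    labels = name.rstrip(".").split(".")
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


# One QTYPE=* query for the whole nameset.
any_query = build_query("example.org", QTYPE_ANY)

# Type-specific alternative: one query per type (A=1, MX=15, TXT=16).
per_type = [build_query("example.org", t) for t in (1, 15, 16)]
```

With a fixed list of three types the client already sends three packets, and it still learns nothing about types it did not think to ask for.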

>P.S. I'm surprised you didn't bring up the negative-caching implications.
>  
>
>
>	What negative cache implications?  Either the node is empty,
>	does not exist or has data.  It takes as much space to store that
>	a node is empty as it takes to remember that a node doesn't exist
>	or that it doesn't have a particular RR type.
>
What I'm getting at is, if a resolver caches the results of a QTYPE=* 
query, and, before any of those RRsets have expired, is queried for an 
RRset under that name which *wasn't* in the response, should it be 
entitled to answer NODATA on its own initiative? After all, it has a 
"full" nameset cached, so it "knows" that the RRset doesn't exist, 
right? Or, does it *always* have to go back to an authoritative server 
in order to send back the NODATA? If the resolver can answer NODATA 
based purely on cached data, must it then insulate the RRsets of that 
"atomic" QTYPE=* response from being "freshened" by the results of 
subsequent type-specific queries? Without such insulation, the RRsets 
could get freshened indefinitely, and the resolver will still keep on 
returning NODATA for any other RRset under the name, even though new 
RRsets may have popped up on the authoritative servers ages ago.
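
The freshening problem can be sketched with a toy cache. Everything below is a hypothetical model: the class, the return strings and the TTL values are assumptions for illustration, and no real resolver is claimed to work this way.

```python
class NameCache:
    """Toy cache for one name that infers NODATA from a cached "*" answer."""

    def __init__(self):
        self.expiry = {}  # rrtype -> absolute expiry time in seconds

    def store(self, rrtypes, now, ttl):
        # Used both for the original "*" response and for later freshening.
        for rrtype in rrtypes:
            self.expiry[rrtype] = now + ttl

    def lookup(self, rrtype, now):
        live = {t for t, exp in self.expiry.items() if exp > now}
        if rrtype in live:
            return "ANSWER"
        # Naive inference: some RRset from the "*" answer is still live,
        # so assume the nameset is complete and the type does not exist.
        return "NODATA" if live else "RECURSE"


cache = NameCache()
cache.store({"A", "MX"}, now=0, ttl=300)   # "*" response at t=0
cache.store({"A"}, now=250, ttl=300)       # type-specific refresh of A at t=250
late = cache.lookup("TXT", now=500)        # still "NODATA" at t=500
```

Because the refreshed A RRset keeps the name "alive", the inferred NODATA outlives every TTL in the original QTYPE=* response.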

In previous discussions, I've proposed that authoritative servers 
include a "generic" negative caching record in all responses to QTYPE=* 
queries, which would be the minimized aggregate of all NODATA responses 
to all queries of the name. A caching resolver can then answer with 
NODATA confidently as long as that "generic" negative caching record is 
in effect. Once that negative caching record expires, then the caching 
resolver will need to once again talk to the authoritative servers if it 
gets a query for one of the RRsets outside of those it already has 
cached. Upon reflection, I think the inclusion of such a "generic" 
negative caching record, to a response with an otherwise-complete Answer 
Section, would probably be too dangerous to the installed codebase, so 
it's something that should only be used by mutual agreement (e.g. 
through EDNS options). I haven't completely given up on the idea though.
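
A minimal sketch of how a cache might use such a "generic" negative caching record, assuming it carries its own TTL and is never extended by type-specific refreshes. No such record exists in the protocol today; the class, method names and TTL values are hypothetical.

```python
class NameCacheWithGenericNegative:
    """Toy cache: NODATA is allowed only while a "generic" negative record lives."""

    def __init__(self):
        self.expiry = {}          # rrtype -> absolute expiry time in seconds
        self.negative_expiry = 0  # expiry of the hypothetical generic record

    def store_any_response(self, rrtypes, now, ttl, negative_ttl):
        for rrtype in rrtypes:
            self.expiry[rrtype] = now + ttl
        self.negative_expiry = now + negative_ttl

    def refresh(self, rrtype, now, ttl):
        # Type-specific freshening never extends the generic negative record.
        self.expiry[rrtype] = now + ttl

    def lookup(self, rrtype, now):
        if self.expiry.get(rrtype, 0) > now:
            return "ANSWER"
        if rrtype not in self.expiry and self.negative_expiry > now:
            return "NODATA"
        return "RECURSE"


cache = NameCacheWithGenericNegative()
cache.store_any_response({"A", "MX"}, now=0, ttl=300, negative_ttl=300)
early = cache.lookup("TXT", now=100)   # generic record still live
cache.refresh("A", now=250, ttl=300)   # A is freshened, the negative record is not
late = cache.lookup("TXT", now=500)    # generic record expired
```

Here `early` is answered from the cache, while `late` forces a trip back to the authoritative servers even though the A RRset is still fresh.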

                                                                         
                                    - Kevin
