libresolv API question: Using RES_SECV and RES_STAYOPEN with TCP

Sat Apr 14 00:15:31 UTC 2007

Balasubramanyam, Shivakumar wrote:
> Hi,
>
> I have a problem using the client libresolv API for the scenario below.
>
>
> I have two nameservers, A and B, in that order in /etc/resolv.conf, and
> I see problems when named is not running on nameserver A's machine.
>
> Using UDP:
> 1. The queries always go to nameserver A, fail, and then succeed with
> B. How do you tell the API to keep using B as long as queries are
> successful?
>   
You don't. Apps shouldn't be choosing resolvers. That's a system 
function. How do you expect your network/system administrators to 
troubleshoot problems when you're ignoring their carefully-planned 
configuration, and going off and doing your own thing? Even more than 
that, by putting resolver-selection logic in your app, you're taking the 
first steps towards re-inventing a very old wheel. There's almost 
certainly (unless this is some sort of weird embedded device) already 
something available for your server that allows it to intelligently and 
adaptively choose where to fetch DNS information. It's called a caching 
resolver. Set one up on your box with a simple config, point 
/etc/resolv.conf to 127.0.0.1 or a local physical interface address, and 
you'll see an increase in availability (at very little resource cost, 
but maybe a little bit of support/maintenance overhead). You'll also 
benefit from caching when set up that way, if you aren't already 
(through nscd or whatever).
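
As a rough sketch of the client side of that setup, assuming the caching resolver is listening on the loopback address, /etc/resolv.conf could be reduced to something like the following. Where the caching resolver sends its own upstream queries is decided in its configuration, which is not shown here.

```
# /etc/resolv.conf (sketch; assumes a caching resolver listening on loopback)
nameserver 127.0.0.1
```

With this in place the stub resolver only ever talks to the local cache, so the order of A and B no longer matters to the application.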

Failing that, what you *may* be able to do, depending on your platform's 
capabilities, is tune the failover timeout in /etc/resolv.conf so that 
even if A is unavailable it won't have a significant impact on the 
application. Check your system's documentation for resolv.conf, and see 
if you can find "timeout"/"retrans" and/or "retry"/"attempts" options 
which can be set. Just be aware that in the (hopefully unlikely) event 
that *all* of the resolvers are down temporarily, having a really short 
failover timeout may cause the lookup to fail completely (sometimes the 
invoking app has its own lookup retry logic built in, so that may not be 
a critical concern).
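
For illustration only, on a platform whose resolver accepts the "timeout" and "attempts" spellings (Linux with glibc does; others may use "retrans" and "retry" instead), the tuning might look like the fragment below. The values and the two nameserver addresses are placeholders from the documentation range standing in for A and B, not recommendations.

```
# /etc/resolv.conf (sketch; option names and limits vary by platform)
nameserver 192.0.2.1    # placeholder address for nameserver A
nameserver 192.0.2.2    # placeholder address for nameserver B
options timeout:1 attempts:2
```

Here "timeout" is the number of seconds to wait for a nameserver to answer before moving on, and "attempts" is how many times the list is tried before the lookup gives up, so with A down each lookup should pay roughly one second before reaching B.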

If you need an even higher level of availability, then look into 
clustering and/or load-balancing and/or anycast solutions for your 
nameservers, so that "A" and "B" would each actually be multiple boxes, 
possibly located in different parts of the network, and backing each 
other up.

(Can't really comment on your other questions, since my experience with 
the resolver library is somewhat limited).

- Kevin