Bind vs flood

Thu Feb 27 15:12:37 UTC 2014

Hi Dmitry,

We have observed similar requests hitting our caching resolver. They
come mostly from home routers that run a DNS server as an open resolver,
which also masks the original source of the request.
We have collected about 60 domains involved so far, and most of them are
related to China. The problem is that the attacker picks a few domains
and generates queries with random hostnames. These are never in the
cache, so the server has to perform recursion for each query. Each query
then ties up one UDP or TCP socket for at least 10 seconds because the
remote DNS server is responding slowly or is down, and at a high enough
query volume this can effectively overload the caching server.

Initially we thought we could fix it with "resolver-query-timeout", but
after analysing the BIND code it seems that any value less than 10
seconds is silently raised to 10 seconds. It would be great to mention
this in the documentation. One solution is to change
MINIMUM_QUERY_TIMEOUT in resolver.c and recompile named, but it would be
nice to understand why 10 seconds was selected as the minimum value in
the first place. See /lib/dns/resolver.c:

#define MAX_SINGLE_QUERY_TIMEOUT 9U
#define MINIMUM_QUERY_TIMEOUT (MAX_SINGLE_QUERY_TIMEOUT + 1U) 

....snip....

void
dns_resolver_settimeout(dns_resolver_t *resolver, unsigned int seconds) {
        REQUIRE(VALID_RESOLVER(resolver));
        if (seconds == 0)
                seconds = DEFAULT_QUERY_TIMEOUT;
        if (seconds > MAXIMUM_QUERY_TIMEOUT)
                seconds = MAXIMUM_QUERY_TIMEOUT;
        if (seconds < MINIMUM_QUERY_TIMEOUT)
                seconds =  MINIMUM_QUERY_TIMEOUT;
        resolver->query_timeout = seconds;
}
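
To illustrate the effect of that clamping, here is a minimal standalone
sketch that mirrors the logic above outside of BIND. The two
MAX_SINGLE/MINIMUM defines are copied from the excerpt; the values used
for DEFAULT_QUERY_TIMEOUT and MAXIMUM_QUERY_TIMEOUT are assumptions on my
part (they are not shown in the excerpt), and the demo_* names are
hypothetical stand-ins, not BIND API.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the excerpt of resolver.c quoted above. */
#define MAX_SINGLE_QUERY_TIMEOUT 9U
#define MINIMUM_QUERY_TIMEOUT (MAX_SINGLE_QUERY_TIMEOUT + 1U)

/* ASSUMPTION: these two values are not in the excerpt; illustrative only. */
#define DEFAULT_QUERY_TIMEOUT MINIMUM_QUERY_TIMEOUT
#define MAXIMUM_QUERY_TIMEOUT 30U

/* Hypothetical stand-in for dns_resolver_t; only the field we need. */
struct demo_resolver {
        unsigned int query_timeout;
};

/* Same clamping steps as dns_resolver_settimeout() in the excerpt. */
static void
demo_settimeout(struct demo_resolver *resolver, unsigned int seconds) {
        assert(resolver != NULL);
        if (seconds == 0)
                seconds = DEFAULT_QUERY_TIMEOUT;
        if (seconds > MAXIMUM_QUERY_TIMEOUT)
                seconds = MAXIMUM_QUERY_TIMEOUT;
        if (seconds < MINIMUM_QUERY_TIMEOUT)
                seconds = MINIMUM_QUERY_TIMEOUT;
        resolver->query_timeout = seconds;
}

/* Convenience wrapper: what timeout is in effect for a requested value? */
static unsigned int
demo_effective_timeout(unsigned int requested) {
        struct demo_resolver r = { 0 };
        demo_settimeout(&r, requested);
        return r.query_timeout;
}
```

With these definitions, asking for 3 seconds yields the same effective
timeout as asking for 10, which matches what we saw when setting a low
"resolver-query-timeout".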

We also tried to create local dummy zones for all these domains, but
since the domains change frequently we have started blocking the most
active open resolvers and coordinating with our local CERT.

It would be nice to have some kind of rate limit on the query volume for
different hosts inside a single zone.
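
To make the idea concrete, here is a rough sketch of what I mean, not
something that exists in BIND as far as I know. It counts queries per
parent zone in one-second windows and refuses further recursion once a
limit is reached. All names (parent_zone, zone_limiter_allow, the table
sizes) are hypothetical, and it assumes lowercase query names without a
trailing dot and that the last two labels identify the zone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ZONES 64        /* hypothetical table size */
#define ZONE_NAME_MAX 256

struct zone_bucket {
        char zone[ZONE_NAME_MAX];
        long window_start;      /* second in which the current window began */
        unsigned int count;     /* queries admitted in this window */
};

struct zone_limiter {
        struct zone_bucket buckets[MAX_ZONES];
        size_t used;
        unsigned int limit_per_second;
};

/* Return a pointer to the last `labels` labels of qname (no trailing dot). */
static const char *
parent_zone(const char *qname, int labels) {
        size_t i = strlen(qname);
        int dots = 0;
        while (i > 0) {
                if (qname[i - 1] == '.' && ++dots == labels)
                        return qname + i;
                i--;
        }
        return qname;   /* fewer labels than requested: use the whole name */
}

/* Return 1 if the query may be recursed, 0 if its zone is over the limit. */
static int
zone_limiter_allow(struct zone_limiter *lim, const char *qname, long now) {
        const char *zone;
        struct zone_bucket *b = NULL;

        assert(lim != NULL && qname != NULL);
        zone = parent_zone(qname, 2);

        for (size_t i = 0; i < lim->used; i++) {
                if (strcmp(lim->buckets[i].zone, zone) == 0) {
                        b = &lim->buckets[i];
                        break;
                }
        }
        if (b == NULL) {
                /* Table full or name too long: fail open, do not block. */
                if (lim->used == MAX_ZONES || strlen(zone) >= ZONE_NAME_MAX)
                        return 1;
                b = &lim->buckets[lim->used++];
                strcpy(b->zone, zone);
                b->window_start = now;
                b->count = 0;
        }
        if (now != b->window_start) {   /* new one-second window */
                b->window_start = now;
                b->count = 0;
        }
        if (b->count >= lim->limit_per_second)
                return 0;
        b->count++;
        return 1;
}
```

The point of keying on the parent zone rather than the full name is that
the random hostnames (niqcs.www..., vbhea.www...) all fall into the same
bucket, so the flood is limited while unrelated zones are unaffected.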

Best regards,

Ivo

On 2/27/14 7:59 AM, Dmitry Rybin wrote:
> A flood began over 2 weeks ago. A lot of queries:
>
> niqcs.www.84822258.com
> vbhea.www.84822258.com
> abpqeftuijklm.www.84822258.com
> adcbefmzidmx.www.84822258.com
> and many others.
>
> BIND answers with "Server failure". Under high load (4 qps) all normal
> clients can get SERVFAIL on a good query, or a query can take more than
> 2-3 seconds.
>
> Recursive clients as reported by "rndc status": 300-500.
>
> I tried to use rate limiting:
>         rate-limit {
>                 nxdomains-per-second 10;
>                 errors-per-second 10;
>                 nodata-per-second 10;
>         };
> I do not see any improvement.
>
> The one way out I have found is to add the flooded zones locally.
>
> What can we do in this situation?
