Problem in Performance test

Fri Jan 15 13:56:40 UTC 2016

------- Original Msg -----
From: "RunxiaWan" <wanrunxia at aliyun.com>
Subject: Problem in Performance test

Hi all,
I am running performance tests on my company's resolver with BIND 9.10.3 and
have found something odd. The test client and the resolver are on the same
LAN. When I use a small set of domains as input with a query sending rate of
10000 per second, everything looks reasonable. However, when I use a set of
thousands of domains as input, the QPS is unexpectedly low and the
latency is high. Here is the result from DNSperf:

  Queries sent:         11823
  Queries completed:    11823 (100.00%)
  Queries lost:         0 (0.00%)

  Response codes:       NOERROR 9883 (83.59%), SERVFAIL 242 (2.05%), NXDOMAIN 1698 (14.36%)
  Average packet size:  request 48, response 203
  Run time (s):         69.891567
  Queries per second:   169.162039

  Average Latency (s):  0.519502 (min 0.003766, max 211.981919)
  Latency StdDev (s):   1.423057

And when I decreased the query sending rate to 100 per second, the latency
dropped to about the same level as when I use a small set of domains as
input. Here is the result from DNSperf:

	Statistics:

  Queries sent:         6000
  Queries completed:    6000 (100.00%)
  Queries lost:         0 (0.00%)

  Response codes:       NOERROR 4995 (83.25%), SERVFAIL 37 (0.62%), NXDOMAIN 968 (16.13%)
  Average packet size:  request 54, response 211
  Run time (s):         62.789257
  Queries per second:   95.557748

  Average Latency (s):  0.083028 (min 0.005266, max 134.543920)
  Latency StdDev (s):   0.568863
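
As a rough cross-check of the two summaries above, the reported QPS is simply queries completed divided by run time, and multiplying QPS by average latency (Little's law) estimates how many queries were in flight on average. The sketch below uses only the figures printed by DNSperf; the dictionary keys and the `summarize` helper are names made up for illustration. If DNSperf caps the number of outstanding queries (its `-q` option, which I believe defaults to 100; please check the man page for your version), then an in-flight estimate close to that cap would suggest the client is waiting on slow answers rather than the resolver being limited to about 170 QPS.

```python
# Figures copied from the two DNSperf summaries in this message.
runs = {
    "10000 qps target": {
        "completed": 11823, "run_time_s": 69.891567, "avg_latency_s": 0.519502,
    },
    "100 qps target": {
        "completed": 6000, "run_time_s": 62.789257, "avg_latency_s": 0.083028,
    },
}


def summarize(run):
    """Return (achieved qps, estimated average number of queries in flight)."""
    qps = run["completed"] / run["run_time_s"]
    # Little's law: average in-flight = arrival rate * average time in system.
    in_flight = qps * run["avg_latency_s"]
    return qps, in_flight


for name, run in runs.items():
    qps, in_flight = summarize(run)
    print(f"{name}: {qps:.2f} qps, about {in_flight:.1f} queries in flight")
```

With these numbers the first run works out to roughly 88 queries in flight and the second to roughly 8, so the high-rate run was sitting much closer to any outstanding-query limit than the low-rate run.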

Does anyone know an explanation for this? Thanks.
---------------
Runxia Wan(Brian)
Research Engineer
BII Lab
Beijing Internet Institute(BII)
rxwan at biigroup.cn
---------------------------------------

A few things are affecting the resolver test:
1) A resolver is allowed to cache answers. In your small set of zones it may have done a series of iterative queries to find the answers, but once it had them, later queries were answered from cache. That is easy going and gives high QPS, limited only by your memory/CPU performance.

2) When you added thousands of random sites, it took much longer to perform all the lookups and begin storing cached results. A much more realistic test, but slower going.

3) Did you stop and restart named after each test? If you just left it running, it still had many results cached and could use those in subsequent tests.

4) UDP queue length: a realistic test would have several clients running dnsperf and querying the same resolver. As the resolver falls behind, the UDP queue of waiting requests grows and some of those are simply dropped by the OS.
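
To make points 1 to 3 concrete, here is a toy model in Python, not BIND's actual behaviour. It assumes a cache hit costs 0.1 ms and a cache miss (a full iterative lookup) costs 50 ms; both costs, the `simulate` function, and the domain counts are invented for illustration, and TTL expiry is ignored.

```python
import random

HIT_COST_S = 0.0001  # assumed cost of answering from cache (hypothetical)
MISS_COST_S = 0.05   # assumed cost of a full iterative lookup (hypothetical)


def simulate(num_domains, num_queries, cache=None, seed=0):
    """Return (hit ratio, average latency in seconds) for a random query stream.

    `cache` is a set of already-resolved names; pass the same set to a second
    call to model leaving named running between tests.
    """
    rng = random.Random(seed)
    if cache is None:
        cache = set()
    hits = 0
    total_s = 0.0
    for _ in range(num_queries):
        name = rng.randrange(num_domains)  # pick one of the test domains
        if name in cache:
            hits += 1
            total_s += HIT_COST_S
        else:
            cache.add(name)  # first lookup populates the cache
            total_s += MISS_COST_S
    return hits / num_queries, total_s / num_queries


small = simulate(num_domains=10, num_queries=10000)
shared_cache = set()
cold = simulate(5000, 10000, cache=shared_cache, seed=1)
warm = simulate(5000, 10000, cache=shared_cache, seed=2)

for label, (hit_ratio, avg_s) in (("small set", small),
                                  ("large set, cold", cold),
                                  ("large set, warm", warm)):
    print(f"{label}: hit ratio {hit_ratio:.3f}, avg latency {avg_s * 1000:.2f} ms")
```

The small set is answered almost entirely from cache, the large set on a cold cache spends much of its time on misses, and a second run against the same cache looks faster again, which is why restarting named (or flushing its cache) between runs matters for comparable results.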

There's a lot going on under the hood!
Hope this helps!
John

End of bind-users Digest, Vol 2288, Issue 1
*******************************************