Tuning suggestions for high-core-count Linux servers
Browne, Stuart
Stuart.Browne at neustar.biz
Fri Jun 2 07:12:09 UTC 2017
Just some interesting investigation results. One of the URLs Matthew Ian Eis linked to talked about using a tool called 'perf'. For the hell of it, I gave it a shot.
Sure enough, it shows some very interesting things.
When BIND was restricted to using a single NUMA node, the biggest call (to _raw_spin_lock) showed 7.05% overhead.
When BIND was allowed to use both NUMA nodes, the same call showed 49.74% overhead; an astonishing difference.
As it was running unrestricted, memory was in use on both nodes:
[root@kr20s2601 ~]# numastat -p 22441

Per-node process memory usage (in MBs) for PID 22441 (named)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                         0.00            0.00            0.00
Heap                         0.45            0.12            0.57
Stack                        0.71            0.64            1.35
Private                      5.28         9415.30         9420.57
----------------  --------------- --------------- ---------------
Total                        6.43         9416.07         9422.50
Given the numbers here (nearly all of the memory sits on node 1), you wouldn't think it would make much of a difference.
Sadly, I didn't get which CPU the UDP listener was attached to.
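For anyone wanting to reproduce the single-node restriction: the post doesn't say how BIND was confined to one NUMA node, so the following is only a sketch of one possible way, assuming Linux sysfs is available. The helper names (parse_cpulist, node_cpus, pin_to_node) are made up for illustration. It sets CPU affinity only and does not bind memory allocation; starting the daemon under numactl with --cpunodebind and --membind is the more usual route.

```python
import os

def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-11,24-35' into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Read the CPUs belonging to one NUMA node from sysfs."""
    with open(os.path.join(sysfs_root, f"node{node}", "cpulist")) as fh:
        return parse_cpulist(fh.read())

def pin_to_node(pid, node):
    """Restrict every thread of a running process to one node's CPUs.

    Affinity is per thread on Linux, so each entry in /proc/<pid>/task
    is handled separately. Threads created later inherit the mask of
    the thread that creates them.
    """
    cpus = node_cpus(node)
    for tid in os.listdir(f"/proc/{pid}/task"):
        os.sched_setaffinity(int(tid), cpus)
```

Calling pin_to_node(22441, 0) as root would pin the named process from the numastat output above to node 0, assuming that PID is still running.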
Anyway, what I've changed so far:
vm.swappiness = 0
vm.dirty_ratio = 1
vm.dirty_background_ratio = 1
kernel.sched_min_granularity_ns = 10000000
kernel.sched_migration_cost_ns = 5000000
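A quick way to confirm that settings like these actually took effect is to read them back from /proc/sys. This is a minimal sketch, not part of the original test setup; the function names are hypothetical. On some newer kernels the scheduler tunables are no longer exposed under /proc/sys, in which case they are simply reported as missing.

```python
import os

# Values as listed in the post.
WANTED = {
    "vm.swappiness": "0",
    "vm.dirty_ratio": "1",
    "vm.dirty_background_ratio": "1",
    "kernel.sched_min_granularity_ns": "10000000",
    "kernel.sched_migration_cost_ns": "5000000",
}

def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return os.path.join(root, *key.split("."))

def check_sysctls(wanted, root="/proc/sys"):
    """Return {key: (wanted, actual)} for keys that differ or are missing."""
    diffs = {}
    for key, want in wanted.items():
        try:
            with open(sysctl_path(key, root)) as fh:
                actual = fh.read().strip()
        except OSError:
            actual = None  # not exposed by this kernel, or not readable
        if actual != want:
            diffs[key] = (want, actual)
    return diffs

if __name__ == "__main__":
    for key, (want, actual) in check_sysctls(WANTED).items():
        print(f"{key}: wanted {want}, found {actual}")
```

No output means every listed value matches what the running kernel reports.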
Query rate thus far reached (on 24 cores, NUMA node restricted): 426k qps
Query rate thus far reached (on 48 cores, NUMA nodes unrestricted): 321k qps
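To put those two figures side by side, a rough per-core calculation helps. This is plain arithmetic on the two rates quoted above; it assumes every core counted was actually serving queries, which the post does not confirm.

```python
# Query rates and core counts quoted in the post.
runs = {
    "24 cores, one NUMA node": (426_000, 24),
    "48 cores, both NUMA nodes": (321_000, 48),
}

def per_core(qps, cores):
    """Average queries per second handled per core."""
    return qps / cores

for label, (qps, cores) in runs.items():
    print(f"{label}: {per_core(qps, cores):,.1f} qps per core")
```

That works out to 17,750 qps per core on one node against roughly 6,700 qps per core across both, so doubling the cores cut total throughput by about a quarter.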
Stuart
'perf' data collected during a 3-minute test run:

[root@kr20s2601 ~]# ls -al perf.data*
-rw-------. 1 root root  717350012 Jun  2 08:36 perf.data.24
-rw-------. 1 root root 1366620296 Jun  2 08:53 perf.data.48
'perf' top 5 (24 cores, NUMA restricted):

Overhead  Command  Shared Object       Symbol
   7.05%  named    [kernel.kallsyms]   [k] _raw_spin_lock
   6.96%  named    libpthread-2.17.so  [.] pthread_mutex_lock
   3.84%  named    libc-2.17.so        [.] vfprintf
   2.36%  named    libdns.so.165.0.7   [.] dns_name_fullcompare
   2.02%  named    libisc.so.160.1.2   [.] isc_log_wouldlog
'perf' top 5 (48 cores):

Overhead  Command  Shared Object       Symbol
  49.74%  named    [kernel.kallsyms]   [k] _raw_spin_lock
   4.52%  named    libpthread-2.17.so  [.] pthread_mutex_lock
   3.09%  named    libisc.so.160.1.2   [.] isc_log_wouldlog
   1.84%  named    [kernel.kallsyms]   [k] _raw_spin_lock_bh
   1.56%  named    libc-2.17.so        [.] vfprintf
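If it's useful to line the two top-5 lists up by symbol, something like the following would do it. It's a sketch that parses only the five-column rows pasted above, not a general 'perf report' parser, and the function names are invented for illustration. Overhead is a share of samples within each run, so the two columns aren't directly comparable in absolute terms.

```python
def parse_perf_lines(text):
    """Map symbol -> overhead percent from rows like those pasted above."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: overhead%, command, shared object, [k] or [.], symbol
        if len(fields) >= 5 and fields[0].endswith("%"):
            result[fields[-1]] = float(fields[0].rstrip("%"))
    return result

RESTRICTED = """\
7.05% named [kernel.kallsyms] [k] _raw_spin_lock
6.96% named libpthread-2.17.so [.] pthread_mutex_lock
3.84% named libc-2.17.so [.] vfprintf
2.36% named libdns.so.165.0.7 [.] dns_name_fullcompare
2.02% named libisc.so.160.1.2 [.] isc_log_wouldlog
"""

UNRESTRICTED = """\
49.74% named [kernel.kallsyms] [k] _raw_spin_lock
4.52% named libpthread-2.17.so [.] pthread_mutex_lock
3.09% named libisc.so.160.1.2 [.] isc_log_wouldlog
1.84% named [kernel.kallsyms] [k] _raw_spin_lock_bh
1.56% named libc-2.17.so [.] vfprintf
"""

def compare(a, b):
    """Return (symbol, overhead_a, overhead_b) rows; None means not in that top 5."""
    return [(sym, a.get(sym), b.get(sym)) for sym in sorted(set(a) | set(b))]

if __name__ == "__main__":
    one_node = parse_perf_lines(RESTRICTED)
    both_nodes = parse_perf_lines(UNRESTRICTED)
    for sym, x, y in compare(one_node, both_nodes):
        print(f"{sym:24} 24-core: {x}  48-core: {y}")
```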