Tuning suggestions for high-core-count Linux servers

Thu Jun 1 11:16:44 UTC 2017

  Hello Stuart,
a few simple ideas for your tests:
 - have you inspected the per-thread CPU? Aren't some of the threads overloaded?
 - have you tried to get the statistics from the Bind server using the
 XML or JSON interface? It may give you more insight into the errors.
 - I may have missed the connection count you use for testing - can you
 post it? Also, how many entries do you have in your database? Can you
 share your named.conf (without any sensitive entries)?
 - what is your network environment? How many switches/routers are there
 between your simulator and the Bind server host?
 - is Bind the only running process on the tested server?
 - what CPUs is the Bind server being run on?
 - is numad running, and when you tried taskset, did you select CPUs on
 the same processor? What does numastat show during the test?
 - how many UDP sockets are in use during your test?
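
For the UDP socket question, here is a minimal sketch of one way to count the sockets and read their queue depths and drop counters. It is not taken from the thread; it assumes a Linux host that exposes /proc/net/udp with the usual column layout (including the trailing "drops" column) and a little-endian CPU such as x86. The names parse_proc_net_udp and UdpSocket are purely illustrative.

```python
import socket
import struct
from typing import List, NamedTuple


class UdpSocket(NamedTuple):
    local: str      # "ip:port" of the local end
    tx_queue: int   # bytes waiting in the send queue
    rx_queue: int   # bytes waiting in the receive queue
    drops: int      # datagrams dropped on this socket


def _decode_addr(hex_addr: str) -> str:
    """Turn e.g. '1501A8C0:0035' into '192.168.1.21:53'."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, so the 32-bit value is byte-swapped.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def parse_proc_net_udp(text: str) -> List[UdpSocket]:
    """Parse the contents of /proc/net/udp (IPv4 sockets only)."""
    sockets = []
    for line in text.splitlines()[1:]:      # first line is the header
        fields = line.split()
        if len(fields) < 13:                # no 'drops' column: skip
            continue
        tx_hex, rx_hex = fields[4].split(":")
        sockets.append(
            UdpSocket(
                local=_decode_addr(fields[1]),
                tx_queue=int(tx_hex, 16),
                rx_queue=int(rx_hex, 16),
                drops=int(fields[12]),
            )
        )
    return sockets
```

Feeding it the contents of /proc/net/udp while the test is running gives one entry per IPv4 UDP socket; the length of the list answers the question above, and a steadily growing drops value on the port 53 socket would point at the receive buffer rather than at the network.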

Curious to hear the responses.

  Lukas

Browne, Stuart <Stuart.Browne at neustar.biz> writes:

> Cheers Matthew.
>
> 1)  Not seeing that error, seeing this one instead:
>
> 01-Jun-2017 01:46:27.952 client: warning: client 192.168.0.23#38125 (x41fe848-f3d1-4eec-967e-039d075ee864.perf1000): error sending response: would block
>
> Only seeing a few of them per run (out of ~70 million requests).
>
> Whilst I can see where this is raised in the BIND code (lib/isc/unix/socket.c in doio_send), I don't understand the underlying reason for it being set (errno == EWOULDBLOCK || errno == EAGAIN).
>
> I've not bumped wmem/rmem up as much as the link (only to 16MB, not 40MB), but no real difference after tweaks. I did another run with stupidly-large core.{rmem,wmem}_{max,default} (64MB), this actually degraded performance a bit so over tuning isn't good either. Need to figure out a good balance here.
>
> I'd love to figure out what the math here should be.  'X number of simultaneous connections multiplied by Y socket memory size = rmem' or some such.
>
> 2) I am still seeing some udp receive errors and receive buffer errors; about 1.3% of received packets.
>
> From a 'netstat' point of view, I see:
>
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> udp   382976  17664 192.168.1.21:53         0.0.0.0:*
>
> The numbers in the receive queue stay in the 200-300k range whilst the send-queue floats around the 20-40k range. wmem already bumped.
>
> 3) Huh, didn't know about this one. Bumped up the backlog, small increase in throughput for my tests. Still need to figure out how to read softnet_stat. More google-fu in my future.
>
> After a reboot and the wmem/rmem/backlog increases, no longer any non-zero in the 2nd column.
>
> 4) Yes, max_dgram_qlen is already set to 512.
>
> 5) Oo! new tool! :)
>
> --
> ...
> 11 drops at location 0xffffffff815df171
> 854 drops at location 0xffffffff815e1c64
> 12 drops at location 0xffffffff815df171
> 822 drops at location 0xffffffff815e1c64
> ...