strange problem with query being dropped/ignored by the BIND process

Marc Richter marc.richter at de.verizon.com
Thu Jun 29 12:52:00 UTC 2017


Hi again,

I have checked this again today.

The send and receive buffers are both 1 MB. The server has 8 CPUs, and
during startup BIND reports this:

	found 8 CPUs, using 8 worker threads
	using 7 UDP listeners per interface
	using up to 32768 sockets

We only have about 1,500 queries per second on this server. CPU (30%) and
memory (50%) usage are not an issue here either.
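
As a rough sanity check on these numbers, the sketch below estimates how
long a complete stall in reading the socket a 1 MB receive buffer could
absorb at this query rate. The 100 bytes charged per queued query is purely
an assumption for illustration (real per-datagram accounting depends on the
kernel), and the function name is hypothetical.

```python
def stall_headroom_seconds(buffer_bytes, queries_per_second, bytes_per_query):
    """Seconds of arriving queries that fit in the buffer if nothing reads it."""
    if queries_per_second <= 0 or bytes_per_query <= 0:
        raise ValueError("rate and per-query size must be positive")
    return buffer_bytes / (queries_per_second * bytes_per_query)


BUFFER_BYTES = 1024 * 1024   # 1 MB receive buffer, as reported above
QPS = 1500                   # average query rate, as reported above
BYTES_PER_QUERY = 100        # ASSUMPTION: illustrative cost of one queued query

headroom = stall_headroom_seconds(BUFFER_BYTES, QPS, BYTES_PER_QUERY)
print(f"approx. {headroom:.1f} s of headroom at the average rate")
```

Under that assumption the buffer holds roughly seven seconds of traffic at
the average rate, so overflows at this load would point to bursts well above
the average or to pauses in reading the socket, rather than to the average
rate alone.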

Now Oracle support is saying that the buffer sizes are fine and that we need
to "speed up the application" so that it reads the data from the receive
buffer faster and thus prevents packet drops.

Do you think that is a reasonable statement in this environment?
What would be the best way to "speed up the application"? Just increase
the number of worker threads?
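
One way to test whether the timeouts line up with receive-buffer drops is to
sample the udpInOverflows counter (mentioned in the earlier message quoted
below) around each alarm and look at the difference. The following is a
minimal sketch, not a finished tool: the helper names and the sample text
are hypothetical, and it assumes that `netstat -s -P udp` prints the UDP
counters as "name = value" pairs on this Solaris release.

```python
import re
import subprocess

# Matches "counterName = 123" pairs as printed in netstat -s style output.
COUNTER_RE = re.compile(r"(\w+)\s*=\s*(\d+)")


def parse_counters(netstat_text):
    """Turn 'name = value' pairs from the given text into a dict of ints."""
    return {name: int(value) for name, value in COUNTER_RE.findall(netstat_text)}


def overflow_delta(before_text, after_text, counter="udpInOverflows"):
    """Return how much the counter grew between two samples."""
    before = parse_counters(before_text).get(counter, 0)
    after = parse_counters(after_text).get(counter, 0)
    return after - before


def read_udp_stats():
    """Fetch the current UDP statistics text from the operating system."""
    # ASSUMPTION: 'netstat -s -P udp' is available and prints the UDP counters.
    result = subprocess.run(["netstat", "-s", "-P", "udp"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical sample text, only to show the expected shape of the input.
SAMPLE_BEFORE = """
UDP     udpInDatagrams      = 1000     udpInErrors         =     0
        udpOutDatagrams     =  990     udpInOverflows      =    12
"""
SAMPLE_AFTER = """
UDP     udpInDatagrams      = 2500     udpInErrors         =     0
        udpOutDatagrams     = 2480     udpInOverflows      =    19
"""

print(overflow_delta(SAMPLE_BEFORE, SAMPLE_AFTER))
```

On the server itself, two calls to read_udp_stats() taken shortly before and
after a monitoring timeout could be passed to overflow_delta() in place of
the sample strings. A non-zero difference in that window would support the
receive-buffer explanation.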

Regards
Marc


On 06/28/17 15:31, Marc Richter wrote:
> Hi Ben,
> 
> thanks for the answer.
> 
> Yeah, I think you are right. I see a lot of udpInOverflows on the system,
> which suggests that the receive buffer is indeed too small.
> 
> Is there any recommendation or best-practice advice on what the
> buffers should ideally be set to on Solaris?
> I searched the ISC Knowledge Base, but didn't find any useful advice.
> 
> Regards
> Marc
> 
> On 06/28/17 14:37, Ben Croswell wrote:
>> Have you checked deeper at the OS level? I have seen on Linux DNS servers
>> silent drops of queries on very busy servers that were exhausting UDP
>> receive buffers.
>>
>> On Jun 28, 2017 10:26 AM, "Marc Richter" <marc.richter at de.verizon.com>
>> wrote:
>>
>>     Hi,
>>
>>     we have a setup here consisting of a recursive DNS server and two
>>     monitoring servers. The monitoring servers send a test query to the DNS
>>     server once every two minutes to check if it is answering properly.
>>
>>     We now have the problem that these test queries time out from time
>>     to time, (correctly) resulting in alarms in our monitoring system.
>>
>>     I have checked this now and noticed that each time we see that alarm, the
>>     query sent by the monitoring server is not being answered at all.
>>     To debug that I ran tcpdump on both the monitoring server and the recursive
>>     DNS server. I see the query being sent out on the monitoring server and I
>>     also see the query being received on the DNS server, however there is no
>>     response sent to this query at all.
>>     Looking at the query log, which I enabled temporarily, the query is also
>>     not logged there so it looks like BIND is ignoring that query somewhere,
>>     although it is properly received by the IP stack of the server.
>>
>>     Do you have any suggestions on how to debug this further, to hopefully
>>     find out where these queries are stuck/dropped/ignored? I have run out
>>     of ideas.
>>
>>     The environment is:
>>     BIND 9.9.9-P5 (Extended Support Version) <id:1ab232a>
>>     running on SunOS sun4v 5.11 11.3
>>
>>
>>     Thanks !
>>     Marc
>>     _______________________________________________
>>     Please visit https://lists.isc.org/mailman/listinfo/bind-users
>>     to unsubscribe from this list
>>
>>     bind-users mailing list
>>     bind-users at lists.isc.org
>>     https://lists.isc.org/mailman/listinfo/bind-users
>>
>>
> 

-- 
Marc Richter
Engr III Cslt-Ntwk Eng&Ops

Sebrathweg 20
44149 Dortmund
Germany

O +49 231 972 1293
F +49 231 972 2587
E marc.richter at de.verizon.com

