nslookup issues

Casey Deccio casey at deccio.net
Tue Sep 13 20:09:02 UTC 2022


I am trying to track down a bug.  I think it is in nslookup (which is why I'm asking here), but there are so many pieces required to reproduce it that I cannot tell for sure.  Let me explain my setup:

All hosts are running Debian bullseye.  None of the problems happened *until* I upgraded from buster.

Host A (monitoring):
 - Installed: nagios4 (4.4.6-4), nrpe-ng (0.2.0-1)
 - IP address: 192.0.2.1

Host B (monitored):
 - Installed: nrpe-ng (0.2.0-1), monitoring-plugins-standard (2.3.1-1), bind9-dnsutils (9.16.27-1~deb11u1)
 - IP address: 192.0.2.2

Host C (monitored through host B):
 - Installed: bind9
 - IP address: 192.0.2.3
 - Configured to answer authoritatively for example.com on port 53.


I run the following on Host B:
$ /usr/lib/nagios/plugins/check_dns -H example.com -s 192.0.2.3 -A -w 0.1 -c 1.0
DNS OK: 0.070 seconds response time. example.com returns 192.0.2.10,2001:db8::10|time=0.069825s;0.100000;1.000000;0.000000

check_dns (part of monitoring-plugins-standard) invokes nslookup, and the response looks good.

When I run nslookup explicitly, it also looks good:
$ /usr/bin/nslookup -sil example.com 192.0.2.3
Server:		192.0.2.3
Address:	192.0.2.3#53

Name:	example.com
Address: 192.0.2.10
Name:	example.com
Address: 2001:db8::10

Now I set things up for monitoring using nrpe-ng with the following configuration:

                 nrpe
            over HTTPS                      DNS
Host A ------------------> Host B -------------> Host C


On Host B, I run the following:
sudo /usr/bin/python3 /usr/sbin/nrpe-ng --debug -f --config /etc/nagios/nrpe-ng.cfg

While that is running, I run the following on Host A:
/usr/lib/nagios/plugins/check_nrpe_ng -H 192.0.2.2 -c check_dns -a example.com 192.0.2.3 0.1 1.0

I can see the DNS request and response on the wire (i.e., using tcpdump).

The result of running the command on Host A is:
DNS CRITICAL - '/usr/bin/nslookup -sil' msg parsing exited with no address

On Host B, I see the following debug output:
200 POST /v1/check/check_dns (192.0.2.1) 78.05ms
Executing: /usr/lib/nagios/plugins/check_dns -H example.com -s 192.0.2.3 -A -w 0.1 -c 1.0

(The command shown matches the one I ran manually earlier.)

After rerunning nrpe-ng with the following:
sudo strace --read=4 -F /usr/bin/python3 /usr/sbin/nrpe-ng --debug -f --config /etc/nagios/nrpe-ng.cfg

I see the following in the strace output on Host B:

[pid 1390861] read(4, "nslookup: ./src/unix/core.c:570:"..., 4096) = 83
 | 00000  6e 73 6c 6f 6f 6b 75 70  3a 20 2e 2f 73 72 63 2f  nslookup: ./src/ |
 | 00010  75 6e 69 78 2f 63 6f 72  65 2e 63 3a 35 37 30 3a  unix/core.c:570: |
 | 00020  20 75 76 5f 5f 63 6c 6f  73 65 3a 20 41 73 73 65   uv__close: Asse |
 | 00030  72 74 69 6f 6e 20 60 66  64 20 3e 20 53 54 44 45  rtion `fd > STDE |
 | 00040  52 52 5f 46 49 4c 45 4e  4f 27 20 66 61 69 6c 65  RR_FILENO' faile |
 | 00050  64 2e 0a                                          d..              |
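
The read() summary line above truncates the string, so for reference here is a small Python sketch that reassembles the full message from the hex bytes in the dump. The variable names are only illustrative; the bytes are copied from the strace output above.

```python
# Hex bytes copied row by row from the strace dump above.
HEX_ROWS = [
    "6e 73 6c 6f 6f 6b 75 70 3a 20 2e 2f 73 72 63 2f",
    "75 6e 69 78 2f 63 6f 72 65 2e 63 3a 35 37 30 3a",
    "20 75 76 5f 5f 63 6c 6f 73 65 3a 20 41 73 73 65",
    "72 74 69 6f 6e 20 60 66 64 20 3e 20 53 54 44 45",
    "52 52 5f 46 49 4c 45 4e 4f 27 20 66 61 69 6c 65",
    "64 2e 0a",
]

# bytes.fromhex() skips the spaces between byte pairs.
raw = bytes.fromhex(" ".join(HEX_ROWS))
message = raw.decode("ascii")

# The byte count should agree with the read() return value (83).
print(len(raw), repr(message))
```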

So it appears that the nslookup process is reporting an error, specifically from this line of code:

https://github.com/libuv/libuv/blob/fb76f210eb6f093bc06a2f07646e56851818ccf2/src/unix/core.c#L602

However, I cannot reproduce it outside of nrpe-ng/check_dns/nslookup.  I need the help of someone more knowledgeable.  Thoughts?  Suggestions?
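
One possible angle, offered as an assumption rather than a finding: the assertion text requires fd > STDERR_FILENO, so it would fire if libuv were asked to close a descriptor numbered 0, 1, or 2. A process can end up holding such a descriptor for an ordinary file or socket if it was started with one of its standard descriptors closed, because the next open() receives the lowest free number. Whether nrpe-ng actually launches check_dns that way is not established here. The Python sketch below only demonstrates the descriptor-numbering mechanism; the helper name first_fd_in_child is hypothetical and nothing in it touches nslookup or libuv.

```python
import os
import subprocess
import sys

# Child program: open one file and report which descriptor number it got.
CHILD = (
    "import os; "
    "fd = os.open(os.devnull, os.O_RDONLY); "
    "os.write(1, str(fd).encode())"
)


def _close_stdin():
    # Runs in the child just before exec, after stdin has been set up.
    os.close(0)


def first_fd_in_child(close_stdin):
    """Return the first descriptor a fresh child process gets from open().

    If close_stdin is true, the child starts with descriptor 0 closed,
    which is the hypothetical condition being illustrated.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        preexec_fn=_close_stdin if close_stdin else None,
        check=True,
    )
    return int(result.stdout.decode())


if __name__ == "__main__":
    normal = first_fd_in_child(close_stdin=False)
    closed = first_fd_in_child(close_stdin=True)
    # With stdin closed, the new descriptor falls inside the 0..2 range
    # that the uv__close assertion rejects.
    print("stdin open:  ", normal, "passes fd > 2:", normal > 2)
    print("stdin closed:", closed, "passes fd > 2:", closed > 2)
```

A similar condition can be approximated from a shell by appending `<&-` to the nslookup command shown earlier, which closes stdin for that process. I have not verified that this triggers the assertion.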

Thanks,
Casey