bind caching data from additional section in responses

btb at bitrate.net
Thu Oct 6 19:46:16 UTC 2016


i have a scenario in which bind appears to be caching data that i wouldn't have expected it to cache, which then breaks resolution.  i have a stub zone configured on server foo:

zone "example.com" {
	type stub;
	masters {
		"example.com" ;
	};
};

masters "example.com" {
	192.168.81.50 ;
};

foo is running bind 9.9.5, on ubuntu 14.04.4 [due shortly for an upgrade].

foo has an ip address of 192.168.81.61/24 - on the same broadcast domain as 192.168.81.50/24.  192.168.81.50 [ns1.example.com] serves an "internal" view of example.com to 192.168.81.0/24 [and thus to foo].  additionally, example.net lists ns1.example.com as one of its nameservers.  when foo looks up ns records for example.net, it gets back the public ip for ns1.example.com in the additional section of the reply.  it appears that bind then caches this address, rather than looking up the ip address for ns1.example.com from an authoritative source [and thus following the stub zone].

when the cache is empty and foo is queried for ns1.example.com first, the query succeeds with the expected [internal] answer, and so does the subsequent query for the example.net ns records:

>rndc flush

>dig @localhost ns1.example.com

; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @localhost ns1.example.com
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49861
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ns1.example.com.		IN	A

;; ANSWER SECTION:
ns1.example.com.	300	IN	A	192.168.81.50

;; AUTHORITY SECTION:
example.com.		300	IN	NS	ns1.example.com.

;; Query time: 1046 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 06 14:15:39 EDT 2016
;; MSG SIZE  rcvd: 76

>dig @localhost example.net ns

; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @localhost example.net ns
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7918
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 7

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;example.net.			IN	NS

;; ANSWER SECTION:
example.net.		300	IN	NS	ns1.he.net.
example.net.		300	IN	NS	ns3.he.net.
example.net.		300	IN	NS	ns2.he.net.
example.net.		300	IN	NS	ns1.example.com.

;; ADDITIONAL SECTION:
ns1.he.net.		172800	IN	A		216.218.130.2
ns1.example.com.	296	IN	A		192.168.81.50
ns2.he.net.		172800	IN	A		216.218.131.2
ns2.he.net.		172800	IN	AAAA	2001:470:200::2
ns3.he.net.		172800	IN	A		216.218.132.2
ns3.he.net.		172800	IN	AAAA	2001:470:300::2

;; Query time: 220 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 06 14:15:43 EDT 2016
;; MSG SIZE  rcvd: 245

however, when the order of the queries is reversed, the external address for ns1.example.com is returned in the additional section, and the subsequent query for ns1.example.com fails [foo cannot talk to ns1.example.com via the public address - i.e. the "nat loopback" problem]:

"bad" data queried first:

>rndc flush

>dig @localhost example.net ns

; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @localhost example.net ns
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23620
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 7

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;example.net.			IN	NS

;; ANSWER SECTION:
example.net.		300	IN	NS	ns1.he.net.
example.net.		300	IN	NS	ns1.example.com.
example.net.		300	IN	NS	ns3.he.net.
example.net.		300	IN	NS	ns2.he.net.

;; ADDITIONAL SECTION:
ns1.he.net.			172800	IN	A		216.218.130.2
ns1.example.com.	172800	IN	A		192.0.2.1
ns2.he.net.			172800	IN	A		216.218.131.2
ns2.he.net.			172800	IN	AAAA	2001:470:200::2
ns3.he.net.			172800	IN	A		216.218.132.2
ns3.he.net.			172800	IN	AAAA	2001:470:300::2

;; Query time: 282 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 06 14:12:37 EDT 2016
;; MSG SIZE  rcvd: 245

>dig @localhost ns1.example.com

; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @localhost ns1.example.com
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 16683
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ns1.example.com.		IN	A

;; Query time: 4008 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 06 14:12:50 EDT 2016
;; MSG SIZE  rcvd: 46

a brief inspection of the cache seems to corroborate this:

>rm named_dump.db 

>rndc flush

>dig @localhost example.net ns

; <<>> DiG 9.9.5-3ubuntu0.8-Ubuntu <<>> @localhost example.net ns
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13961
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 7

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;example.net.			IN	NS

;; ANSWER SECTION:
example.net.		300	IN	NS	ns1.he.net.
example.net.		300	IN	NS	ns1.example.com.
example.net.		300	IN	NS	ns2.he.net.
example.net.		300	IN	NS	ns3.he.net.

;; ADDITIONAL SECTION:
ns1.he.net.		172799	IN	A	216.218.130.2
ns1.example.com.	172799	IN	A	192.0.2.1
ns2.he.net.		172799	IN	A	216.218.131.2
ns2.he.net.		172799	IN	AAAA	2001:470:200::2
ns3.he.net.		172799	IN	A	216.218.132.2
ns3.he.net.		172799	IN	AAAA	2001:470:300::2

;; Query time: 1393 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 06 14:17:04 EDT 2016
;; MSG SIZE  rcvd: 245

>rndc dumpdb

>grep -iF -B 5 -A 5 'example.com' named_dump.db 
[...]
; answer
example.net.		292	NS	ns1.he.net.
			292	NS	ns1.example.com.
			292	NS	ns2.he.net.
			292	NS	ns3.he.net.
[...]
; glue
ns1.example.com.	172791	A	192.0.2.1
; glue
[...]
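
to repeat this check without eyeballing the grep output, the glue-labelled records can be pulled out of the dump directly.  a minimal sketch, assuming the "; glue" comment sits on the line immediately above the record it describes [as in the excerpt above].  glue_records is a hypothetical helper name, not part of bind, and only the first line of a multi-record rrset is printed:

```shell
# glue_records: hypothetical helper, not part of bind.
# prints the line that directly follows each "; glue" comment in a
# cache dump; reads the named files, or stdin if none are given.
# assumption: the comment is on the line immediately above its record.
# usage: glue_records named_dump.db
glue_records() {
    awk '
        /^; glue/ { want = 1; next }      # remember that the next line is glue
        want      { print; want = 0 }     # print that one line, then reset
    ' "$@"
}
```

after "rndc dumpdb", running "glue_records named_dump.db | grep -F example.com" should show only the glue entry for ns1.example.com, if one is cached.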

is my perception accurate?  is bind caching the data it got back in the additional section, for a name outside of the queried zone?  if so, why?  how can i tell it not to do this?  enabling "nat loopback" would "fix" this, but imho, to put it diplomatically, that is inelegant at best, and i'd prefer to avoid it.

thanks
-ben


