Bad CNAME treatment consistency between direct CNAME request vs A request

Emmanuel Fusté manu.fuste at gmail.com
Fri May 13 13:30:59 UTC 2022


Hello,
I had a hard time identifying the source of an intermittent name
resolution failure for a customer.
The root cause is a DNS spec violation: an RRset containing
multiple CNAME records:

 dig  @ns-29-b.gandi.net CNAME lb.qual.flash-global.net

; <<>> DiG 9.18.2-1+ubuntu20.04.1+isc+3-Ubuntu <<>> @ns-29-b.gandi.net
CNAME lb.qual.flash-global.net
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42945
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;lb.qual.flash-global.net.      IN      CNAME

;; ANSWER SECTION:
lb.qual.flash-global.net. 10800 IN      CNAME   lb1.qual.flash-global.net.
lb.qual.flash-global.net. 10800 IN      CNAME   lb2.qual.flash-global.net.

;; Query time: 10 msec
;; SERVER: 213.167.230.30#53(ns-29-b.gandi.net) (UDP)
;; WHEN: Fri May 13 15:03:00 CEST 2022
;; MSG SIZE  rcvd: 89
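
The violation in the answer above (RFC 2181 section 10.1 allows only one
canonical name per alias) can be checked mechanically. This is a minimal
sketch, assuming the answer section is available as dig-style text lines;
find_multi_cname_owners is a hypothetical helper name, not part of dig or
BIND.

```python
from collections import defaultdict


def find_multi_cname_owners(answer_lines):
    """Return {owner: [targets]} for every owner name with more than one CNAME."""
    targets = defaultdict(list)
    for line in answer_lines:
        fields = line.split()
        # Assumed dig layout: owner TTL class type rdata...
        if len(fields) >= 5 and fields[2].upper() == "IN" and fields[3].upper() == "CNAME":
            # Owner names compare case-insensitively in DNS.
            targets[fields[0].lower()].append(fields[4])
    return {owner: t for owner, t in targets.items() if len(t) > 1}


# The two records returned by ns-29-b.gandi.net in the output above.
answer = [
    "lb.qual.flash-global.net. 10800 IN      CNAME   lb1.qual.flash-global.net.",
    "lb.qual.flash-global.net. 10800 IN      CNAME   lb2.qual.flash-global.net.",
]
print(find_multi_cname_owners(answer))
```

A non-empty result means the RRset is invalid. RRSIG lines and dig comment
lines are skipped because their type field is not CNAME.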

If I try the resolution via my BIND (9.18.2) resolver with a cold cache, it
properly returns SERVFAIL:
  dig  @172.29.0.36 +dnssec +cd CNAME lb.qual.flash-global.net

; <<>> DiG 9.18.2-1+ubuntu20.04.1+isc+3-Ubuntu <<>> @172.29.0.36
+dnssec +cd CNAME lb.qual.flash-global.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 24053
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; COOKIE: 23ac9b539bf16ad001000000627e57c0b7d630e657322232 (good)
;; QUESTION SECTION:
;lb.qual.flash-global.net.      IN      CNAME

;; Query time: 30 msec
;; SERVER: 172.29.0.36#53(172.29.0.36) (UDP)
;; WHEN: Fri May 13 15:06:09 CEST 2022
;; MSG SIZE  rcvd: 81

because the authoritative answer is correctly identified as invalid:
named[147998]: FORMERR resolving 'lb.qual.flash-global.net/CNAME/IN':
213.167.230.30#53
named[147998]: FORMERR resolving 'lb.qual.flash-global.net/CNAME/IN':
217.70.187.161#53
named[147998]: FORMERR resolving 'lb.qual.flash-global.net/CNAME/IN':
173.246.100.82#53

Google DNS returns the same.

If I send an A request, I get an answer, which is unexpected in my opinion:
dig  @172.29.0.36 +dnssec +cd A lb.qual.flash-global.net

; <<>> DiG 9.18.2-1+ubuntu20.04.1+isc+3-Ubuntu <<>> @172.29.0.36
+dnssec +cd A lb.qual.flash-global.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26546
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; COOKIE: b5755aa921e65a4401000000627e58a481dbcf3655737b6b (good)
;; QUESTION SECTION:
;lb.qual.flash-global.net.      IN      A

;; ANSWER SECTION:
lb.qual.flash-global.net. 10800 IN      CNAME   lb1.qual.flash-global.net.
lb.qual.flash-global.net. 10800 IN      RRSIG   CNAME 13 4 10800
20220526000000 20220505000000 57605 flash-global.net.
NVDmeCSKkx998LRnmiB6hWz4PdZJ5WPG6CCrDTSP587pLUxxoxeNlCmJ
l8l0p8/l8o+ZmZr1EXqxUA1FXpGbGw==
lb1.qual.flash-global.net. 600  IN      A       51.68.158.37
lb1.qual.flash-global.net. 600  IN      RRSIG   A 13 4 600
20220526000000 20220505000000 57605 flash-global.net.
G1YUaDtWVGxj5NbA18crQ912tW/VWra49wi3U1EeRio9kId+2mwo7Vuj
GH8adlvvjQyps7IBtj9gYVmbewN+GQ==

;; Query time: 30 msec
;; SERVER: 172.29.0.36#53(172.29.0.36) (UDP)
;; WHEN: Fri May 13 15:09:57 CEST 2022
;; MSG SIZE  rcvd: 339

Google DNS does the same.

BUT

Now on my side I have cache pollution, as a new CNAME request gives me:

dig  @172.29.0.36 +dnssec +cd CNAME lb.qual.flash-global.net

; <<>> DiG 9.18.2-1+ubuntu20.04.1+isc+3-Ubuntu <<>> @172.29.0.36
+dnssec +cd CNAME lb.qual.flash-global.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42637
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; COOKIE: ea748ef065e32df101000000627e59947b2e1424679d72f2 (good)
;; QUESTION SECTION:
;lb.qual.flash-global.net.      IN      CNAME

;; ANSWER SECTION:
lb.qual.flash-global.net. 10560 IN      CNAME   lb1.qual.flash-global.net.
lb.qual.flash-global.net. 10560 IN      RRSIG   CNAME 13 4 10800
20220526000000 20220505000000 57605 flash-global.net.
NVDmeCSKkx998LRnmiB6hWz4PdZJ5WPG6CCrDTSP587pLUxxoxeNlCmJ
l8l0p8/l8o+ZmZr1EXqxUA1FXpGbGw==

;; Query time: 20 msec
;; SERVER: 172.29.0.36#53(172.29.0.36) (UDP)
;; WHEN: Fri May 13 15:13:56 CEST 2022
;; MSG SIZE  rcvd: 211

until I issue an rndc flush command.
This cache pollution is bad and does not seem to happen on the Google side
(but there are many DNS servers behind 8.8.8.8).
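
The CNAME, A, CNAME sequence shown above can be replayed with a small script.
This is only a sketch under assumptions: dig must be installed, the resolver
cache must be cold before the first query (for example after rndc flush),
and dig_status and cache_is_polluted are hypothetical helper names.

```python
import re
import subprocess

# Matches the rcode in dig's header line, e.g. "status: SERVFAIL".
STATUS_RE = re.compile(r"status: ([A-Z]+)")


def dig_status(server, qtype, name):
    """Run dig with the options used above and return the rcode, or None."""
    out = subprocess.run(
        ["dig", f"@{server}", "+dnssec", "+cd", qtype, name],
        capture_output=True, text=True, check=False,
    ).stdout
    match = STATUS_RE.search(out)
    return match.group(1) if match else None


def cache_is_polluted(query, server, name):
    """Send CNAME, A, CNAME and report whether the CNAME rcode changed.

    query is any callable (server, qtype, name) -> rcode string.
    Returns (changed, rcode_before, rcode_after).
    """
    before = query(server, "CNAME", name)
    query(server, "A", name)  # the A lookup is what populates the cache
    after = query(server, "CNAME", name)
    return before != after, before, after
```

Calling cache_is_polluted(dig_status, "172.29.0.36", "lb.qual.flash-global.net")
against a cold cache should report a change from SERVFAIL to NOERROR if the
behavior described here is present.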

I would have expected a SERVFAIL/FORMERR in the A request case. Even
if I can understand a conservative approach on Google's side, I
don't buy it for BIND and would expect a configuration directive to
reject such answers.
If this (the A case) is expected behavior for BIND, I think that
the cache pollution is not, and it should be fixed.

Am I wrong?

Whether Gandi should stop allowing several CNAMEs to be declared on a
single name, and how to contact them to get this fixed, is more a
question for dns-operation.

Emmanuel.

