DNSSEC troubleshooting on a recursive server.

Grant Keller gkeller at corp.sonic.net
Tue Aug 6 23:09:08 UTC 2013


Hello,

We have 7 recursive DNS servers running BIND 9.9.2, and we are seeing
some strange behavior when validating DNSSEC. We have seen this happen a
few times, and in the past the problem has gone away when the server was
rebooted, so my first guess is that some record is stuck in the cache.
Here is an example from one of the affected servers:

# dig a zygo.com @pdns02.domaincontrol.com +nocomments

; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> a zygo.com
@pdns02.domaincontrol.com +nocomments
;; global options: +cmd
;zygo.com.            IN    A
zygo.com.        86400    IN    A    50.28.48.60
zygo.com.        3600    IN    NS    pdns01.domaincontrol.com.
zygo.com.        3600    IN    NS    pdns02.domaincontrol.com.
;; Query time: 83 msec
;; SERVER: 208.109.255.50#53(208.109.255.50)
;; WHEN: Tue Aug  6 16:04:26 2013
;; MSG SIZE  rcvd: 98

# dig rrsig zygo.com @pdns02.domaincontrol.com +nocomments

; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> rrsig zygo.com
@pdns02.domaincontrol.com +nocomments
;; global options: +cmd
;zygo.com.            IN    RRSIG
zygo.com.        86400    IN    RRSIG    A 7 2 86400 20130812183056
20130728183056 19712 zygo.com.
FbuZDfcptJtbOCxsCV+U3uQA+ETkrvhKAJrpVhlVMAGrYhgFBHWTvsgK
8ZY9DP7Chr8rXF8BXjr0zh06Fi62RJQiRuytFLN117kqJjXe4g/5q4l3
O9XsuF2WeDj3TudMeqcb6hxGstly34gfec/RZdktlogmJTSu5+t3BdwP myU=
zygo.com.        3600    IN    RRSIG    NS 7 2 3600 20130812183056
20130728183056 19712 zygo.com.
YTqpH1q+wSZCUGhjw0qKWRBGSARInipMqUEOg0IaM49rgSSynYPDDt01
7XOCpOnlZXSuiGv42yac/b3Se4gGHOfdyOHRncjiSmwL5vYlVhCBqUS3
qgPSnqYonqC7uxaVg7tQm0ErZpWFJiMMdHfs/HpLTKq5tnZfHflCkhWj si4=
zygo.com.        3600    IN    RRSIG    SOA 7 2 3600 20130812183056
20130728183056 19712 zygo.com.
XDFuwBva0CzYYyXJIWI7HWWrFgK2GrhhOqb/fxtvDA7623WEb5DkROHg
nx1cfI7w585MU3R0P2ZmrAXKULMFaZ0i24WvWa+hZf/GpBaO9wYGm1oS
jWnUXpxNT15G/XXB91rVS0kCU4vEdLkVCXgh3k63QB+Drs0gfrPHjeSj Co8=
zygo.com.        86400    IN    RRSIG    MX 7 2 86400 20130812183056
20130728183056 19712 zygo.com.
dsRwujkNkm2P/lgBf9CfF5d1qzgaFYrQob5RDEXLYQkA2BkYd26yakQF
xb8doXp1q3AxxlQ8yZpyUUGZmT13Aw/IBm8hFMdy+PmSxDGqoveUeah9
dh3abPVrWlP+jbcLXVX9r5Lg5yVxXFAqplfmPj8fuupFJSkOEfMMB6P0 iMw=
zygo.com.        86400    IN    RRSIG    TXT 7 2 86400 20130812183056
20130728183056 19712 zygo.com.
LV05eG+KKxv1dLUvKL3xddiEtKuQ+gOM5dPFfAn6Qpzt+xg13E0rLvwR
wV3w9Ol10r2cbGZr5leQciXHNoJtRKo8gNuMdxOFu/F+vu3zZZDYvR2I
CrWrO5Acm7oVORllTs0gEIvYzXkmJErFEnwlc6uXENZlVEt08drmq0Lq 8nc=
zygo.com.        3600    IN    RRSIG    DNSKEY 7 2 3600 20130812183056
20130728183056 54396 zygo.com.
iZ5qg7HIuCb7N/0SCPPj0JRiNWBYLc8DupV2VSfjhv12fiqMvaLimDb+
xYaxFGaHzNySM6rgDfZf1sod5iCwaTUVXDwru/zgDoDv2PV5xYUZ0U9v
ubgiACKmJAE+uPe2CI5ECaLX6fzuKP5hrBIurk33jt0znauogIPyzpOP
y9woc4tSxlmllFWJcO6PUU0ZBrHESepxll+v7St9aMVCiGe8g22O8NPn
3JKazq8OHQPptGAY0TnqU0oZoDIiYY1oEscTGr2hOWdAh9Kz95rMRtfq
4L6aP63MnEIbYPUzzTbMiQqfZJkJshwfttnRTxlcZ+7/WDYl2YJVIR+S RtYsYA==
zygo.com.        3600    IN    RRSIG    NSEC3PARAM 7 2 3600
20130812183056 20130728183056 19712 zygo.com.
Zt+Bak9VK/apMNCXmPxUdYtIdKJtVo5IwMtnuYv8SgZMOPZIvl2ROD1y
Ra48JWEeQ3vMErRt0BsJPwl4Y3a6auM6tZMxhG+Ja6ZWoL2IaMcgGpct
CW9Pl8hUIykRcL4QfzyPlQM6o8ZwSuhAAPw2+7N9dvhSWzPT6IKq9B2T DQQ=
zygo.com.        3600    IN    NS    pdns01.domaincontrol.com.
zygo.com.        3600    IN    NS    pdns02.domaincontrol.com.
;; Query time: 83 msec
;; SERVER: 208.109.255.50#53(208.109.255.50)
;; WHEN: Tue Aug  6 16:05:13 2013
;; MSG SIZE  rcvd: 1386

That is the correct answer from the authoritative name server. When I
query the local server, I get this:

# dig a zygo.com @127.0.0.1 +nocomments

; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> a zygo.com
@127.0.0.1 +nocomments
;; global options: +cmd
;zygo.com.            IN    A
;; Query time: 162 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Aug  6 16:06:10 2013
;; MSG SIZE  rcvd: 26

# dig rrsig zygo.com @127.0.0.1 +nocomments

; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> rrsig zygo.com
@127.0.0.1 +nocomments
;; global options: +cmd
;zygo.com.            IN    RRSIG
zygo.com.        5    IN    RRSIG    DS 8 2 86400 20130811043747
20130804032747 8795 com.
cKYDb9z9EcoVHk4AWohaECz7LwphvX+LGqinfh2H6ZeWz6oWWFMGs8Pc
ZAYwh63e7+czbwhfy1LALwBKVRh9ijyg43NW0Ag7ZamQ56yc5k27UiuR
x9skNeOLe+CDpfYM9LwbEnPKG2bJhAXAZ9lZEPT/seB5ID23HBwy9jfy wig=
zygo.com.        153315    IN    NS    pdns02.domaincontrol.com.
zygo.com.        153315    IN    NS    pdns01.domaincontrol.com.
pdns01.domaincontrol.com. 4258    IN    A    216.69.185.50
pdns01.domaincontrol.com. 6156    IN    AAAA    2607:f208:207::32
pdns02.domaincontrol.com. 43034    IN    A    208.109.255.50
pdns02.domaincontrol.com. 3041    IN    AAAA    2607:f208:303::32
;; Query time: 80 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Aug  6 16:06:41 2013
;; MSG SIZE  rcvd: 333

The thing that really confuses me is that the TTL on the RRSIG DS record
has been stuck at 5 for about a day now. I tried running rndc flushname
zygo.com, which did not help. What else can I do to troubleshoot this,
and if it is a cache problem, what can I do to clear the records? Thanks.
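
To make the two RRSIG answers above easier to compare side by side, here
is a rough sketch of a shell helper that reduces dig output to the fields
that matter for validation: TTL, type covered, key tag, signer, and the
inception/expiration window. The function name rrsig_summary and its
output format are made up for illustration. It assumes each record
arrives on a single line, as dig prints it, rather than in the
mail-wrapped form pasted above, and it compares the timestamps as plain
UTC numbers.

```shell
# Hypothetical helper: summarize RRSIG records read from stdin.
# Optional argument: the time to check against, as YYYYMMDDHHMMSS in UTC
# (defaults to the current time).
rrsig_summary() {
    now=${1:-$(date -u +%Y%m%d%H%M%S)}
    # RRSIG presentation fields after "IN RRSIG":
    #   $5 type covered, $6 algorithm, $7 labels, $8 original TTL,
    #   $9 expiration, $10 inception, $11 key tag, $12 signer name
    awk -v now="$now" '$3 == "IN" && $4 == "RRSIG" {
        state = (now >= $10 && now <= $9) ? "in-window" : "OUT-OF-WINDOW"
        printf "ttl=%s covers=%s keytag=%s signer=%s inception=%s expiration=%s %s\n",
               $2, $5, $11, $12, $10, $9, state
    }'
}

# Example usage (needs network access, so left commented out):
#   dig rrsig zygo.com @pdns02.domaincontrol.com +nocomments | rrsig_summary
#   dig rrsig zygo.com @127.0.0.1 +nocomments | rrsig_summary
```

Run against both servers, this should make it obvious when the local
answer only carries the parent's RRSIG for DS (signer com.) while the
authoritative answer carries the zone's own signatures.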



-- 
Grant Keller
Sonic.net System Operations


