Unexplainable Problem with Upgrade 8.2.3-REL to 9.1.2

Barry Finkel b19141 at achilles.ctd.anl.gov
Fri Jun 15 14:05:10 UTC 2001


Kevin Darcy wrote in reply to my posting:
>> 
>> Are you a slave for aps.anl.gov? BIND 8 used to "mix glue" between parent
>> and child zones, on zone transfers. My guess is that the delegations from
>> aps.anl.gov to ser.aps.anl.gov are wrong, but you never noticed it before
>> because there was enough glue from the child zone for your nameserver(s)
>> to use. BIND 9, on the other hand, enforces strict zone-cut separation,
>> so immediately after the upgrade you were seeing the "raw", incorrect
>> delegation information in the zone transfer. Now you've covered up that
>> problem by making yourself authoritative for the child zone. But the
>> delegations should be fixed.
>> 
>> BTW, if this is the externally-visible ser.aps.anl.gov you're talking
>> about, it definitely has delegation problems. nsx.lbl.gov is answering
>> non-authoritatively for ser.aps.anl.gov and ns2.es.net returns NXDOMAIN!
>> 
>> - Kevin
> 
[snip]

James A Griffin <agriffin at cpcug.org> replied:

>Barry, here is some detail on Kevin's observation about delegation
>problems.  I'll send you the full log if you want.  You can get a copy
>of 'doc' from ftp://ftp.shub-internet.org/brad/dns/
>
>Jim
>
>Doc-2.2.2: doc -v ser.aps.anl.gov.
>Doc-2.2.2: Starting test of ser.aps.anl.gov.   parent is aps.anl.gov.
>Doc-2.2.2: Test date - Thu Jun 14 17:23:35 EDT 2001
>soa @dns1.anl.gov. for aps.anl.gov. has serial: 3106240
>soa @dns2.anl.gov. for aps.anl.gov. has serial: 3106240
>soa @ns2.es.net. for aps.anl.gov. has serial: 3106240
>soa @nsx.lbl.gov. for aps.anl.gov. has serial: 3106240
>soa @oxygen.aps.anl.gov. for aps.anl.gov. has serial: 3106241
>WARNING: Found 2 unique SOA serial #'s for aps.anl.gov.
>Found 5 NS and 5 glue records for ser.aps.anl.gov. @dns1.anl.gov. (AUTH)
>Found 5 NS and 5 glue records for ser.aps.anl.gov. @dns2.anl.gov. (AUTH)
>Found 0 NS and 0 glue records for ser.aps.anl.gov. @ns2.es.net. (AUTH)
>Found 5 NS and 5 glue records for ser.aps.anl.gov. @nsx.lbl.gov.
>(non-AUTH)
>Found 5 NS and 5 glue records for ser.aps.anl.gov. @oxygen.aps.anl.gov.
>(AUTH)
>DNServers for aps.anl.gov.
>   === 4 were also authoritatve for ser.aps.anl.gov.
>   === 1 were non-authoritative for ser.aps.anl.gov.
>ERROR: Found 2 diff sets of NS records
>   === from servers authoritative for ser.aps.anl.gov.
>NS list summary for ser.aps.anl.gov. from parent (aps.anl.gov.) servers
>  == dns1.anl.gov. dns2.anl.gov. ns2.es.net.
>  == nsx.lbl.gov. oxygen.aps.anl.gov.
>soa @dns1.anl.gov. for ser.aps.anl.gov. serial: 2100110
>soa @dns2.anl.gov. for ser.aps.anl.gov. serial: 2100110
>soa @ns2.es.net. for ser.aps.anl.gov. serial:
>ERROR: no SOA record for ser.aps.anl.gov. from ns2.es.net.
>soa @nsx.lbl.gov. for ser.aps.anl.gov. serial:
>ERROR: no SOA record for ser.aps.anl.gov. from nsx.lbl.gov.
>soa @oxygen.aps.anl.gov. for ser.aps.anl.gov. serial: 2100110
>SOA serial #'s agree for ser.aps.anl.gov.
>Authoritative domain (ser.aps.anl.gov.) servers agree on NS for
>ser.aps.anl.gov.
>NS list from ser.aps.anl.gov. authoritative servers matches list from
>  === parent (aps.anl.gov.) servers not authoritative for
>ser.aps.anl.gov.
>Checking 0 potential addresses for hosts at ser.aps.anl.gov.
>  ==
>Summary:
>   ERRORS found for ser.aps.anl.gov. (count: 3)
>   WARNINGS issued for ser.aps.anl.gov. (count: 1)
>Done testing ser.aps.anl.gov.  Thu Jun 14 17:23:50 EDT 2001

I am not sure that this output tells me what happened.  Here is the
scenario, as far as I can tell.

   1) There is one master DNS server (oxygen.aps.anl.gov), which is the
      master for aps.anl.gov and all of its subdomains.

   2) There are four BIND 8.2.3-REL slave servers:

           dns1.anl.gov
           dns2.anl.gov
           nsx.lbl.gov
           ns2.es.net

   3) I do not know how the master is configured for the zone

           ser.aps.anl.gov

      The hostmaster there told me that it has been in its own zone
      for a long time.

   4) The four slaves did not have a separate entry in named.conf for

           ser.aps.anl.gov

      but we had no problems resolving nodenames within that domain.

   5) When the oxygen.aps.anl.gov name server was upgraded from 8.2.3
      to 9.1.2, the slaves could no longer resolve names within the
    
           ser.aps.anl.gov

      domain.  I fixed this problem by adding the definition for that
      domain to named.conf on dns1 and dns2.  I asked our two offsite
      slaves to do the same.  LBL has made the change; ESnet has not.
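
For reference, the kind of slave definition added in step 5 would look
roughly like the sketch below.  The master address and the file name
are placeholders (192.0.2.1 is a documentation address), not the values
actually in use on dns1 and dns2.

```
// Sketch of a slave zone entry in named.conf; values are placeholders.
zone "ser.aps.anl.gov" {
        type slave;
        masters { 192.0.2.1; };          // placeholder for the address of oxygen.aps.anl.gov
        file "slave/ser.aps.anl.gov";    // hypothetical file name for the transferred copy
};
```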

I do not see anything in the output of doc-2.2.2 (I had run doc-2.1.4
here) that indicates any error other than the fact that the ESnet
name server has not added the new zone to its named.conf file.

Before the upgrade, if a DNS query for

     www.ser.aps.anl.gov

had arrived at one of the slaves, the slave would have returned NXDOMAIN
unless the entry was in the parent aps.anl.gov domain.  I do not see
how the slave could have forwarded the query to oxygen, the master
server.  So, I have to conclude that either

     1) the ser.aps zone was contained within the parent zone, or

     2) there were some glue record(s) in the aps zone for ser.aps.
        BIND 9.1.2 may have complained about these glue record(s),
        so they were removed from the zone to get it to load correctly.
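
To make the second possibility concrete, a delegation with glue in the
parent aps.anl.gov zone file generally takes the form sketched below.
This is an illustration only, not the contents of the real zone file:
the first two server names are taken from the NS list in the doc output
above, while the host ns.ser.aps.anl.gov and its address are
hypothetical, included just to show what a glue record looks like.

```
; Sketch of a delegation in the aps.anl.gov zone file (illustrative only).
$ORIGIN aps.anl.gov.
ser             IN  NS  oxygen.aps.anl.gov.
ser             IN  NS  dns1.anl.gov.
ser             IN  NS  ns.ser.aps.anl.gov.   ; hypothetical server inside the child zone
ns.ser          IN  A   192.0.2.53            ; glue for the hypothetical server (documentation address)
```

A glue address record is needed in the parent only for a name server
whose name lies inside the delegated zone, as with the hypothetical
ns.ser entry here.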

----------------------------------------------------------------------
Barry S. Finkel
Electronics and Computing Technologies Division
Argonne National Laboratory          Phone:    +1 (630) 252-7277
9700 South Cass Avenue               Facsimile:+1 (630) 252-9689
Building 221, Room B236              Internet: BSFinkel at anl.gov
Argonne, IL   60439-4844             IBMMAIL:  I1004994


