bind replication between sites

Thu Jul 27 21:08:50 UTC 2006

On Jul 26, 2006, at 6:07 PM, Dave Henderson wrote:

> If I replace the ns1 and ns2 with the .local extension, wouldn't  
> there be a problem when people from outside our intranet try to  
> resolve names that we host (this is what I am gathering from your  
> comments below - just confirming)?

You may be correct. I don't know how you have this set up, so I  
couldn't say for sure.

In that case, a better approach is to get rid of the CNAME records  
and replace them with A records.

> These name servers are used to resolve zone names from external  
> sources and they can already do that.  Is there a reason you say  
> that NS servers can't be CNAME records or is it just good practice  
> not to?

It's in the DNS rulebook - an alias may not be used in the data of  
any record except another CNAME record.
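
To make the rule concrete, here is a minimal Python sketch that scans a set of records for NS data naming an alias (a name that owns a CNAME). The find_alias_targets helper and the tuple layout are illustrative assumptions, not anything BIND provides. The NS names come from the dig output quoted further down; the CNAME targets are a guess at how the zone is laid out. The sketch checks NS records only, although the rule applies to other record types too.

```python
# Illustrative check: flag NS records whose data names an alias (a CNAME owner).
# Records are (owner, type, data) tuples. The CNAME targets below are assumed.

def find_alias_targets(records):
    """Return the NS records whose data is the owner name of a CNAME record."""
    # Any name that owns a CNAME is an alias; DNS names compare case-insensitively.
    aliases = {owner.lower() for owner, rtype, _ in records if rtype == "CNAME"}
    return [
        (owner, rtype, data)
        for owner, rtype, data in records
        if rtype == "NS" and data.lower() in aliases
    ]

records = [
    ("esesen.com.", "NS", "ns1.digital-pipe.com."),
    ("esesen.com.", "NS", "ns2.digital-pipe.com."),
    ("esesen.com.", "NS", "mastns0.digital-pipe.local."),
    ("ns1.digital-pipe.com.", "CNAME", "ns1.digital-pipe.local."),  # assumed target
    ("ns2.digital-pipe.com.", "CNAME", "ns2.digital-pipe.local."),  # assumed target
]

for owner, rtype, data in find_alias_targets(records):
    print(f"{owner} {rtype} {data} refers to an alias")
```

With the sample data, the two NS records naming ns1 and ns2.digital-pipe.com are flagged. Replacing the two CNAME records with A records leaves nothing to report.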

> I am guessing this is why the notify functionality is not working,  
> but the refresh cycle is.

That is correct.
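
A rough way to picture it: the master builds its notify list from the zone's NS records, skips the server named as the SOA mname, and needs an address record for each remaining name. The Python sketch below models only that much. The notify_targets function, the address tables, and every IP address in it are hypothetical, and a real server has more cases (also-notify, for one) than this covers.

```python
# Simplified model of how a master picks notify targets.
# Function name, tables, and all addresses are assumptions for illustration.

def notify_targets(ns_names, mname, a_records):
    """Return the addresses that would be notified in this simplified model."""
    targets = []
    for name in ns_names:
        if name == mname:
            continue  # the server listed as the SOA mname is not notified
        # Only direct A records count here; a name that exists solely as a
        # CNAME has no entry in a_records and so contributes no address.
        targets.extend(a_records.get(name, []))
    return targets

ns_names = [
    "ns1.digital-pipe.com.",
    "ns2.digital-pipe.com.",
    "mastns0.digital-pipe.local.",
]
mname = "mastns0.digital-pipe.local."

# Case 1: ns1 and ns2 exist only as CNAMEs, so no A records are found for them.
with_cnames = {"mastns0.digital-pipe.local.": ["192.168.1.11"]}
print(notify_targets(ns_names, mname, with_cnames))  # prints []

# Case 2: ns1 and ns2 have A records (made-up addresses).
with_a = {
    "mastns0.digital-pipe.local.": ["192.168.1.11"],
    "ns1.digital-pipe.com.": ["192.168.1.31"],
    "ns2.digital-pipe.com.": ["192.168.2.31"],
}
print(notify_targets(ns_names, mname, with_a))
```

In the first case nobody is notified, so the slaves only catch up at the next scheduled refresh. In the second, both slaves receive a notification.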

Chris Buxton
Men & Mice

> Chris Buxton <cbuxton at menandmice.com> wrote:
> Sorry, I mistyped. What I meant to say was, "If 192.168.1.31 is the  
> master, then...". Since this is the slave, that part is fine.
>
> If this is purely private data, not meant for the public to see,  
> and if digital-pipe.local can universally be resolved on your  
> private LAN, then there's only one problem: An NS record may not  
> refer to an alias. Either replace the two CNAME records with A  
> records or change the NS records (in every zone) to point to ns1  
> and ns2.digital-pipe.local instead of .com.
>
> Then (if there are no other problems), DNS notify should start to  
> work for you.
>
> Chris Buxton
> Men & Mice
>
> On Jul 26, 2006, at 1:49 PM, Dave Henderson wrote:
>
>> I queried the slave server (192.168.1.31).  Why would that output  
>> be a problem from the slave server?
>>
>> I do run a .local domain as the domain that all servers and  
>> clients are a part of.  All the other domains (.com, .net, etc.)  
>> are just dns zones for websites.
>>
>> I am the admin for these servers so some of it is learning, but  
>> some is not.  Why would having a CNAME for those records be wrong  
>> or bad?  The only problem I can see would be they would resolve to  
>> internal IPs (i.e. 192.168.x.x) instead of external IPs.  That can  
>> be changed with no problem.  Can you tell me on what other levels  
>> this is a bad idea?  I am always open to learning.  :)
>>
>> Thanks,
>>
>> Dave
>>
>>
>>
>> Chris Buxton <cbuxton at menandmice.com> wrote:
>> Which server did you query, master or slave? If 192.168.1.31, then  
>> you have a real problem, but it doesn't sound like that's the case.
>>
>> I see a .local name among your name servers. Is this a public or  
>> private zone?
>>
>> I did some 'dig'-ing and found the problem with DNS notify: Your  
>> servers have a zone named digital-pipe.com. But that zone doesn't  
>> have A records for ns1.digital-pipe.com and ns2.digital-pipe.com.  
>> Instead, your version of the zone has CNAME records pointing into  
>> the .local version of the domain name.
>>
>> This is wrong on multiple levels. You should probably consider  
>> hiring a consultant to fix this, unless you're doing this as a  
>> learning exercise. In that case, you should consider training.
>>
>> Chris Buxton
>> Men & Mice
>>
>> On Jul 24, 2006, at 5:59 AM, Dave Henderson wrote:
>>
>>> If that is how it is supposed to be working, then you are right,  
>>> mine is not.  It only seemed to work using the refresh system.   
>>> Here is what I get when typing 'dig esesen.com axfr' on the slave  
>>> that is not receiving the updates:
>>>
>>>
>>> ns1:/var/cache/bind# dig esesen.com axfr
>>>
>>> ; <<>> DiG 9.2.4 <<>> esesen.com axfr
>>> ;; global options:  printcmd
>>> esesen.com.             604800  IN      SOA     mastns0.digital-pipe.local. support.digital-pipe.com. 4 1200 600 604800 1200
>>> esesen.com.             604800  IN      NS      ns1.digital-pipe.com.
>>> esesen.com.             604800  IN      NS      ns2.digital-pipe.com.
>>> esesen.com.             604800  IN      NS      mastns0.digital-pipe.local.
>>> test.esesen.com.        604800  IN      CNAME   www.esesen.com.
>>> www.esesen.com.         604800  IN      A       70.46.29.218
>>> www.esesen.com.         604800  IN      A       70.119.167.222
>>> esesen.com.             604800  IN      SOA     mastns0.digital-pipe.local. support.digital-pipe.com. 4 1200 600 604800 1200
>>> ;; Query time: 5 msec
>>> ;; SERVER: 192.168.1.31#53(192.168.1.31)
>>> ;; WHEN: Mon Jul 24 08:51:05 2006
>>> ;; XFR size: 8 records
>>>
>>>
>>> It is showing the serial as 4, but the master contains a 5.  The  
>>> other issue is, I am trying to change what port is used to do all  
>>> this.  I can change it back to 53 for testing purposes, but would  
>>> like to get that switched to a different port as well.  Any ideas?
>>>
>>> Dave
>>>
>>>
>>> Chris Buxton <cbuxton at menandmice.com> wrote:
>>> The refresh system of pulling updates on a schedule is antiquated,
>>> having been superseded by the notify system. The refresh system is of
>>> course still there, but usually now acts as a backup. The reason the
>>> notify mechanism is better is, it doesn't require frequent refresh
>>> checks from the slave, and updates made to the master are replicated
>>> to the slave usually within a few seconds.
>>>
>>> Normally, when you load an updated zone into memory (or, in the case
>>> of dynamic zones, when you send an update), the master server sends a
>>> notification of the update to the slave. The following conditions
>>> have to be met for the slave to get the update:
>>>
>>> - The packet has to be received by the slave.
>>> - The source address of the packet, as received by the slave, must
>>> equal the master server IP in the slave's zone statement.
>>> - The packet must relate to a zone that the slave believes it should
>>> be authoritative for.
>>>
>>> If these conditions are met, the slave will initiate a refresh check
>>> ahead of schedule. That is, it will ask the master server for the
>>> zone's SOA record, compare serial numbers, and then go and get the
>>> zone transfer.
>>>
>>> You already know the last stage is working, because the scheduled
>>> refresh check works fine. Of the three conditions, you can be pretty
>>> sure the last one is not the problem. So that just leaves networking
>>> and firewalling issues as potential problems.
>>>
>>> Chris Buxton
>>> Men & Mice
>>>
>>> On Jul 23, 2006, at 7:30 AM, Dave Henderson wrote:
>>>
>>> > Thanks for the reply Chris. I am confused as to why changing the
>>> > refresh cycle wouldn't be the simple fix to the problem. Isn't
>>> > that how long a slave is supposed to wait before it contacts the
>>> > master to see if updates are available? If so, then that is why
>>> > mine wasn't working (it was set for 7 days).
>>> >
>>> > I have two slave servers. One is on the same network segment as
>>> > the master and the other is at a remote site. The firewalls are
>>> > configured to forward DNS traffic to the correct servers and
>>> > everything seems to be working now that the values are much lower.
>>> >
>>> > My SOA does show the master BIND server name (which is resolvable
>>> > on both slaves).
>>> >
>>> > Thanks again for your help and any other advice you have.
>>> >
>>> > Dave
>>> >
>>> > Chris Buxton wrote:
>>> > The refresh, retry, and expire timers shouldn't matter as long as
>>> > notify is working.
>>> >
>>> > BTW: Don't set expire as low as refresh, or you risk having your
>>> > slave take itself offline for no reason. In most cases, expire should
>>> > be at least a week, regardless of refresh and retry.
>>> >
>>> > Your problem is that DNS Notify is not working. This can be hidden by
>>> > setting refresh and retry to very low values, but it's better to
>>> > figure out why notify is not working.
>>> >
>>> > Are your servers behind a NAT firewall, on private addresses? That
>>> > will often cause notify to fail, since the master will notify the
>>> > slave's public address, but the NAT server can't handle
>>> > internal-to-internal communication through NAT'd addresses. This can be worked
>>> > around by setting also-notify in your zone statements, listing the
>>> > slave server.
>>> >
>>> > Are your zone's NS records correct? Does your SOA show your master
>>> > server as the mname, or the slave? The master will not notify the
>>> > server listed by the SOA as the zone's master.
>>> >
>>> > Chris Buxton
>>> > Men & Mice
>>> >
>>> > On Jul 22, 2006, at 4:25 PM, Dave Henderson wrote:
>>> >
>>> >> Sten,
>>> >>
>>> >> After looking at the SOA portion, I noticed its numbers
>>> >> were way too high for what I was trying to accomplish. I have
>>> >> since adjusted them and everything seems to be working fine.
>>> >> Thanks to everyone who helped.
>>> >>
>>> >> Dave
>>> >>
>>> >>
>>> >> Sten Carlsen wrote:
>>> >> Ok, a few more questions:
>>> >> After making the change with your script, do you check the new SOA
>>> >> with dig (dig axfr)? Does it show the new incremented serial?
>>> >> And back to Kevin's question: what are the times involved in the SOA,
>>> >> especially the refresh time? You could try to make that short, like 5
>>> >> minutes for the experiment, and see if that gets the zone updated
>>> >> when it expires.
>>> >>
>>> >> A couple of other things to consider: is axfr allowed from both
>>> >> slaves? Is any firewall open for both UDP and TCP transfers with the
>>> >> proper port numbers?
>>> >>
>>> >> If this does not ring a bell, I think you should publish the relevant
>>> >> named.conf and zone files from both master and slaves here. Please do
>>> >> not change anything, except keys for TSIG, rndc and the like. Any editing
>>> >> can inadvertently make changes that hide the real problems from
>>> >> view and draw attention in the wrong direction, delaying the solution.
>>> >> I am pretty sure that a lot of people on the list will look at the
>>> >> real files and analyse them.
>>> >>
>>> >> Dave Henderson wrote:
>>> >>> Sten,
>>> >>>
>>> >>> You are correct in your first paragraph. I can create a brand new
>>> >>> zone and it gets propagated to both slave servers. Any changes
>>> >>> made to that new zone do not get replicated. And I have been
>>> >>> increasing the serial number. I wrote a perl script that
>>> >>> adds/del/updates the record and increases the serial number. I verified
>>> >>> all content added/deleted/updated via the script is correct, but
>>> >>> still no slave servers get updated.
>>> >>>
>>> >>> The reason the serial numbers are low is because they are the
>>> >>> ones listed on the slave server. The ones on the master are up to
>>> >>> 17. There shouldn't be a problem at all.
>>> >>>
>>> >>> DHCP can be ruled out because these aren't clients creating
>>> >>> dynamic DNS entries (so no, it's not a stupid Wintel client).
>>> >>>
>>> >>> Dave
>>> >>>
>>> >>>
>>> >>> Sten Carlsen wrote:
>>> >>> As I read the original post:
>>> >>> from starting the slave server with no zone file on it, the
>>> >>> replication takes place as it should and after a short while there is
>>> >>> a zone file on the slave server. So far all ok.
>>> >>> Next, if a change is done on the master, that change is not
>>> >>> propagated to the slave server.
>>> >>>
>>> >>> Is this the correct description of the problem?
>>> >>>
>>> >>> If this is so, I would like to ask if you remember to increase the
>>> >>> serial number in the master zone file? Or do you use nsupdate or dhcp
>>> >>> to update the zone?
>>> >>>
>>> >>> I noticed the serial numbers from the log files being very low,
>>> >>> possibly indicating that they were not incremented. This would present
>>> >>> the exact picture I explain at the top of this post.
>>> >>>
>>> >>> Kevin Darcy wrote:
>>> >>>
>>> >>>> Dave Henderson wrote:
>>> >>>>
>>> >>>>
>>> >>>>> Gang,
>>> >>>>>
>>> >>>>> I have three bind servers running. Two at site 1 (one master
>>> >>>>> and one slave) and the other at site 2. Replication of the zone
>>> >>>>> file seems to take place, but when updates are made on the
>>> >>>>> master server, they don't get replicated to the slaves.
>>> >>>>>
>>> >>>>>
>>> >>>> I don't quite understand that sentence. The file is replicating
>>> >>>> but the changes aren't (???)
>>> >>>>
>>> >>>>
>>> >>>>> Here is a snippet from the log of the master if I delete the
>>> >>>>> file on a slave server:
>>> >>>>>
>>> >>>>> Jul 19 13:55:55 localhost named[8329]: client 192.168.0.31#32936: transfer of 'esessen.org/IN': AXFR started
>>> >>>>>
>>> >>>>> and here it is on the slave server:
>>> >>>>>
>>> >>>>> Jul 19 13:55:53 localhost named[7857]: zone esessen.org/IN: transferred serial 3
>>> >>>>> Jul 19 13:55:53 localhost named[7857]: transfer of 'esessen.org/IN' from 192.168.0.11#53: end of transfer
>>> >>>>> Jul 19 13:55:53 localhost named[7857]: zone esessen.org/IN: sending notifies (serial 3)
>>> >>>>>
>>> >>>>>
>>> >>>>> That all seems to work ok, but if I make a change to a domain, it
>>> >>>>> doesn't get replicated. There are no records on the master
>>> >>>>> server indicating a transfer at all. The slave contains:
>>> >>>>>
>>> >>>>> Jul 19 13:55:52 localhost named[7857]: zone cliquesoftware.com/IN: sending notifies (serial 2)
>>> >>>>>
>>> >>>>> The actual serial number on the master is 17. Here is the
>>> >>>>> master log (after a restart):
>>> >>>>>
>>> >>>>> Jul 19 11:19:41 localhost named[8329]: zone cliquesoftware.com/IN: loaded serial 17
>>> >>>>> Jul 19 11:19:41 localhost named[8329]: zone cliquesoftware.com/IN: sending notifies (serial 17)
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>> How long have you waited and what is the REFRESH setting on the
>>> >>>> zone? If there's something wrong with the NOTIFY mechanism for this
>>> >>>> zone, then it could take up to REFRESH time for the changes to replicate.
>>> >>>>
>>> >>>> If NOTIFY is broken, then that could be tackled as a separate
>>> >>>> issue. Better to establish that normal REFRESH-timed replication works
>>> >>>> before getting into the arcana of NOTIFY.
>>> >>>>
>>> >>>>
>>> >>>>> I am getting the following on the master, but I don't have a
>>> >>>>> server or client using the following ip address:
>>> >>>>>
>>> >>>>> Jul 19 12:11:20 localhost named[8329]: client 192.168.0.200#2679: updating zone 'digital-pipe.local/IN': update failed: 'RRset exists (value dependent)' prerequisite not satisfied (NXRRSET)
>>> >>>>> Jul 19 12:11:20 localhost named[8329]: client 192.168.0.200#2682: update 'digital-pipe.local/IN' denied
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>> Probably just a stupid Wintel client that's misconfigured to
>>> >>>> register its name in DNS.
>>> >>>>
>>> >>>>
>>> >>>> - Kevin
>>> >> --
>>> >> Best regards
>>> >>
>>> >> Sten Carlsen
>>> >>
>>> >> Let HIM who has an empty INBOX send the first mail.
>>> >>