bind replication between sites

Chris Buxton cbuxton at menandmice.com
Wed Jul 26 20:40:26 UTC 2006


Which server did you query, master or slave? If 192.168.1.31, then  
you have a real problem, but it doesn't sound like that's the case.
I see a .local name among your name servers. Is this a public or  
private zone?

I did some 'dig'-ing and found the problem with DNS notify: Your  
servers have a zone named digital-pipe.com. But that zone doesn't  
have A records for ns1.digital-pipe.com and ns2.digital-pipe.com.  
Instead, your version of the zone has CNAME records pointing into  
the .local version of the domain name.
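
As a rough illustration of that check, here is a small Python sketch that flags NS targets inside a zone that are aliases or have no address record. The record lists are hypothetical (the real zone contents were not posted, the .local target names are guesses, and the addresses come from the 192.0.2.0/24 documentation range), and check_ns_targets is a name made up for this example.

```python
# Hypothetical records for illustration only; not the poster's real zone data.
BROKEN_ZONE = [
    ("digital-pipe.com.", "NS", "ns1.digital-pipe.com."),
    ("digital-pipe.com.", "NS", "ns2.digital-pipe.com."),
    ("ns1.digital-pipe.com.", "CNAME", "ns1.digital-pipe.local."),
    ("ns2.digital-pipe.com.", "CNAME", "ns2.digital-pipe.local."),
]

FIXED_ZONE = [
    ("digital-pipe.com.", "NS", "ns1.digital-pipe.com."),
    ("digital-pipe.com.", "NS", "ns2.digital-pipe.com."),
    ("ns1.digital-pipe.com.", "A", "192.0.2.1"),
    ("ns2.digital-pipe.com.", "A", "192.0.2.2"),
]


def check_ns_targets(records, origin):
    """Return a list of problems with NS targets that live inside `origin`."""
    origin = origin.lower()

    # Index the record types that exist at each owner name.
    types_by_name = {}
    for name, rtype, _rdata in records:
        types_by_name.setdefault(name.lower(), set()).add(rtype)

    problems = []
    for _name, rtype, rdata in records:
        if rtype != "NS":
            continue
        target = rdata.lower()
        # Targets outside the zone are resolved elsewhere; skip them here.
        if target != origin and not target.endswith("." + origin):
            continue
        types = types_by_name.get(target, set())
        if "CNAME" in types:
            problems.append(f"{target} is a CNAME; an NS target must not be an alias")
        elif not types & {"A", "AAAA"}:
            problems.append(f"{target} has no address record in this zone")
    return problems


for problem in check_ns_targets(BROKEN_ZONE, "digital-pipe.com."):
    print(problem)
```

The broken list produces one complaint per name server, and the fixed list produces none. A master that cannot turn an NS name into an address has nowhere to send its notify for that server.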

This is wrong on multiple levels. You should probably consider hiring
a consultant to fix this, unless you're doing this as a learning
exercise. In that case, you should consider training.

Chris Buxton
Men & Mice

On Jul 24, 2006, at 5:59 AM, Dave Henderson wrote:

> If that is how it is supposed to be working, then you are right,
> mine is not.  It only seemed to work using the refresh system.  Here
> is what I get when typing 'dig esesen.com axfr' on the slave that
> is not receiving the updates:
>
>
> ns1:/var/cache/bind# dig esesen.com axfr
>
> ; <<>> DiG 9.2.4 <<>> esesen.com axfr
> ;; global options:  printcmd
> esesen.com.             604800  IN      SOA     mastns0.digital-pipe.local. support.digital-pipe.com. 4 1200 600 604800 1200
> esesen.com.             604800  IN      NS      ns1.digital-pipe.com.
> esesen.com.             604800  IN      NS      ns2.digital-pipe.com.
> esesen.com.             604800  IN      NS      mastns0.digital-pipe.local.
> test.esesen.com.        604800  IN      CNAME   www.esesen.com.
> www.esesen.com.         604800  IN      A       70.46.29.218
> www.esesen.com.         604800  IN      A       70.119.167.222
> esesen.com.             604800  IN      SOA     mastns0.digital-pipe.local. support.digital-pipe.com. 4 1200 600 604800 1200
> ;; Query time: 5 msec
> ;; SERVER: 192.168.1.31#53(192.168.1.31)
> ;; WHEN: Mon Jul 24 08:51:05 2006
> ;; XFR size: 8 records
>
>
> It is showing the serial as 4, but the master contains a 5.  The  
> other issue is, I am trying to change what port is used to do all  
> this.  I can change it back to 53 for testing purposes, but would  
> like to get that switched to a different port as well.  Any ideas?
>
> Dave
>
>
> Chris Buxton <cbuxton at menandmice.com> wrote:
> The refresh system of pulling updates on a schedule is antiquated,
> having been superseded by the notify system. The refresh system is of
> course still there, but usually now acts as a backup. The notify
> mechanism is better because it doesn't require frequent refresh
> checks from the slave, and updates made to the master are usually
> replicated to the slave within a few seconds.
>
> Normally, when you load an updated zone into memory (or, in the case
> of dynamic zones, when you send an update), the master server sends a
> notification of the update to the slave. The following conditions
> have to be met for the slave to get the update:
>
> - The packet has to be received by the slave.
> - The source address of the packet, as received by the slave, must
> equal the master server IP in the slave's zone statement.
> - The packet must relate to a zone that the slave believes it should
> be authoritative for.
>
> If these conditions are met, the slave will initiate a refresh check
> ahead of schedule. That is, it will ask the master server for the
> zone's SOA record, compare serial numbers, and then go and get the
> zone transfer.
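
To make the three conditions and the serial check concrete, here is a minimal Python sketch of the decision a slave makes. It is not BIND's code: the function names, the dictionary layout, and the returned strings are made up for illustration, and the master address 192.168.0.11 is only assumed from the transfer log quoted in this thread. The serial comparison follows RFC 1982 serial number arithmetic.

```python
def serial_gt(a, b):
    """True if 32-bit SOA serial `a` is newer than `b` (RFC 1982 arithmetic)."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    return a != b and ((a - b) & 0xFFFFFFFF) < 0x80000000


def handle_notify(source_ip, zone, slave_zones, master_serial):
    """Decide what a slave does with a NOTIFY that has already reached it.

    slave_zones   maps zone name -> {"masters": [ip, ...], "serial": int}
    master_serial is what the master would answer to the follow-up SOA query.
    Condition 1 (the packet arrives at all) is implied by this being called.
    """
    cfg = slave_zones.get(zone)
    if cfg is None:
        # Condition 3: the slave must believe it serves this zone.
        return "ignore: not a slave for this zone"
    if source_ip not in cfg["masters"]:
        # Condition 2: the source address must match a configured master.
        return "ignore: notify from non-master"
    # Early refresh check: compare the master's serial with our own.
    if serial_gt(master_serial, cfg["serial"]):
        return "transfer"
    return "up to date"


# Assumed slave configuration; serial 4 is what the slave held in this thread.
zones = {"esesen.com": {"masters": ["192.168.0.11"], "serial": 4}}

print(handle_notify("192.168.0.11", "esesen.com", zones, 5))
print(handle_notify("192.168.0.1", "esesen.com", zones, 5))  # rewritten source
```

With the serials from this thread (slave at 4, master at 5), a notify that arrives from the configured master leads to a transfer. The same notify arriving from a different source address, for example one rewritten by NAT, is dropped, and the slave waits for the refresh timer.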
>
> You already know the last stage is working, because the scheduled
> refresh check works fine. Of the three conditions, you can be pretty
> sure the last one is not the problem. So that just leaves networking
> and firewalling issues as potential problems.
>
> Chris Buxton
> Men & Mice
>
> On Jul 23, 2006, at 7:30 AM, Dave Henderson wrote:
>
> > Thanks for the reply, Chris. I am confused as to why changing the
> > refresh cycle wouldn't be the simple fix to the problem. Isn't
> > that how long a slave is supposed to wait before it contacts the
> > master to see if updates are available? If so, then that is why
> > mine wasn't working (it was set for 7 days).
> >
> > I have two slave servers. One is on the same network segment as
> > the master and the other is at a remote site. The firewalls are
> > configured to forward DNS traffic to the correct servers and
> > everything seems to be working now that the values are much lower.
> >
> > My SOA does show the master BIND server name (which is resolvable
> > on both slaves).
> >
> > Thanks again for your help and any other advice you have.
> >
> > Dave
> >
> > Chris Buxton wrote:
> > The refresh, retry, and expire timers shouldn't matter as long as
> > notify is working.
> >
> > BTW: Don't set expire as low as refresh, or you risk having your
> > slave take itself offline for no reason. In most cases, expire
> > should be at least a week, regardless of refresh and retry.
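
That rule of thumb is easy to check mechanically. A minimal Python sketch, assuming the timers are given in seconds; check_soa_timers is a made-up helper name, and the one-week floor is simply the advice above turned into a constant, not a limit that BIND enforces.

```python
WEEK = 7 * 24 * 3600  # 604800 seconds


def check_soa_timers(refresh, expire):
    """Return warnings for SOA timers that ignore the advice above."""
    warnings = []
    if expire <= refresh:
        warnings.append("expire is not greater than refresh")
    if expire < WEEK:
        warnings.append("expire is shorter than one week")
    return warnings


# Timers from the SOA in the dig output quoted in this thread.
print(check_soa_timers(refresh=1200, expire=604800))

# A risky combination: the zone can expire as soon as one refresh fails.
print(check_soa_timers(refresh=1200, expire=1200))
```

The first call returns an empty list and the second returns both warnings.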
> >
> > Your problem is that DNS Notify is not working. This can be
> > hidden by setting refresh and retry to very low values, but it's
> > better to figure out why notify is not working.
> >
> > Are your servers behind a NAT firewall, on private addresses? That
> > will often cause notify to fail, since the master will notify the
> > slave's public address, but the NAT server can't handle internal-to-
> > internal communication through NAT'd addresses. This can be worked
> > around by setting also-notify in your zone statements, listing the
> > slave server.
> >
> > Are your zone's NS records correct? Does your SOA show your master
> > server as the mname, or the slave? The master will not notify the
> > server listed by the SOA as the zone's master.
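
Put together, the set of servers a master notifies can be sketched like this in Python. This is a simplification with a made-up function name, not BIND's implementation. The NS and SOA names are taken from the dig output quoted in this thread; treating 192.168.0.31 and 192.168.1.31 as the two slaves' addresses is an assumption based on the addresses that appear in that output and in the logs.

```python
def notify_targets(ns_names, soa_mname, also_notify=()):
    """Return (names, addresses) that a master would send NOTIFY to.

    Every NS target except the SOA mname is notified by name, so each name
    must resolve to an address the master can reach. Entries in also_notify
    are plain IP addresses and are notified in addition.
    """
    mname = soa_mname.lower()
    names = [n for n in ns_names if n.lower() != mname]
    return names, list(also_notify)


names, addresses = notify_targets(
    ns_names=[
        "ns1.digital-pipe.com.",
        "ns2.digital-pipe.com.",
        "mastns0.digital-pipe.local.",
    ],
    soa_mname="mastns0.digital-pipe.local.",
    also_notify=["192.168.0.31", "192.168.1.31"],  # assumed slave addresses
)
print(names)
print(addresses)
```

Listing the slaves' internal addresses in also-notify sidesteps both name resolution and the public-address NAT problem, because the master then sends the notify straight to an address it can reach.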
> >
> > Chris Buxton
> > Men & Mice
> >
> > On Jul 22, 2006, at 4:25 PM, Dave Henderson wrote:
> >
> >> Sten,
> >>
> >> After looking at the SOA portion, I noticed its numbers
> >> were way too high for what I was trying to accomplish. I have
> >> since adjusted them and everything seems to be working fine.
> >> Thanks to everyone who helped.
> >>
> >> Dave
> >>
> >>
> >> Sten Carlsen wrote:
> >> Ok, a few more questions:
> >> After making the change with your script, do you check the new SOA
> >> with dig (dig axfr)? Does it show the new incremented serial?
> >> And back to Kevin's question: what are the times involved in the SOA,
> >> especially the refresh time? You could try making that short, like 5
> >> minutes, for the experiment and see if that gets the zone updated
> >> when it expires.
> >>
> >> A couple of other things to consider: is axfr allowed from both
> >> slaves? Is any firewall open for both UDP and TCP transfers with
> >> the proper port numbers?
> >>
> >> If this does not ring a bell, I think you should publish the relevant
> >> named.conf and zone files from both master and slaves here. Please do
> >> not change anything, except keys for TSIG, rndc and the like. Any editing
> >> can inadvertently make changes that hide the real problems from view
> >> and draw attention in the wrong direction, delaying the solution. I am
> >> pretty sure that a lot of people on the list will look at the real
> >> files and analyse them.
> >>
> >> Dave Henderson wrote:
> >>> Sten,
> >>>
> >>> You are correct in your first paragraph. I can create a brand new
> >>> zone and it gets propagated to both slave servers. Any changes
> >>> made to that new zone do not get replicated. And I have been
> >>> increasing the serial number. I wrote a Perl script that adds,
> >>> deletes, and updates records and increases the serial number. I
> >>> verified all content added/deleted/updated via the script is correct,
> >>> but still no slave servers get updated.
> >>>
> >>> The reason the serial numbers are low is because they are the
> >>> ones listed on the slave server. The ones on the master are up to
> >>> 17. There shouldn't be a problem at all.
> >>>
> >>> DHCP can be ruled out because these aren't clients creating
> >>> dynamic DNS entries (so no, it's not a stupid Wintel client).
> >>>
> >>> Dave
> >>>
> >>>
> >>> Sten Carlsen wrote:
> >>> As I read the original post: starting the slave server with no zone
> >>> file on it, the replication takes place as it should, and after a
> >>> short while there is a zone file on the slave server. So far all OK.
> >>> Next, if a change is made on the master, that change is not
> >>> propagated to the slave server.
> >>>
> >>> Is this the correct description of the problem?
> >>>
> >>> If this is so, I would like to ask: do you remember to increase the
> >>> serial number in the master zone file? Or do you use nsupdate or DHCP
> >>> to update the zone?
> >>>
> >>> I noticed the serial numbers in the log files are very low, possibly
> >>> indicating that they were not incremented. This would produce the
> >>> exact picture I describe at the top of this post.
> >>>
> >>> Kevin Darcy wrote:
> >>>
> >>>> Dave Henderson wrote:
> >>>>
> >>>>
> >>>>> Gang,
> >>>>>
> >>>>> I have three bind servers running. Two at site 1 (one master
> >>>>> and one slave) and the other at site 2. Replication of the zone
> >>>>> file seems to take place, but when updates are made on the
> >>>>> master server, they don't get replicated to the slaves.
> >>>>>
> >>>>>
> >>>> I don't quite understand that sentence. The file is replicating
> >>>> but the changes aren't (???)
> >>>>
> >>>>
> >>>>> Here is a snippet from the log of the master if I delete the
> >>>>> file on a slave server:
> >>>>>
> >>>>> Jul 19 13:55:55 localhost named[8329]: client 192.168.0.31#32936: transfer of 'esessen.org/IN': AXFR started
> >>>>>
> >>>>> and here it is on the slave server:
> >>>>>
> >>>>> Jul 19 13:55:53 localhost named[7857]: zone esessen.org/IN: transferred serial 3
> >>>>> Jul 19 13:55:53 localhost named[7857]: transfer of 'esessen.org/IN' from 192.168.0.11#53: end of transfer
> >>>>> Jul 19 13:55:53 localhost named[7857]: zone esessen.org/IN: sending notifies (serial 3)
> >>>>>
> >>>>>
> >>>>> That all seems to work OK, but if I make a change to a domain, it
> >>>>> doesn't get replicated. There are no records on the master
> >>>>> server indicating a transfer at all. The slave contains:
> >>>>>
> >>>>> Jul 19 13:55:52 localhost named[7857]: zone cliquesoftware.com/IN: sending notifies (serial 2)
> >>>>>
> >>>>> The actual serial number on the master is 17. Here is the
> >>>>> master log (after a restart):
> >>>>>
> >>>>> Jul 19 11:19:41 localhost named[8329]: zone cliquesoftware.com/IN: loaded serial 17
> >>>>> Jul 19 11:19:41 localhost named[8329]: zone cliquesoftware.com/IN: sending notifies (serial 17)
> >>>>>
> >>>>>
> >>>>>
> >>>> How long have you waited and what is the REFRESH setting on the
> >>>> zone? If there's something wrong with the NOTIFY mechanism for this
> >>>> zone, then it could take up to REFRESH time for the changes to
> >>>> replicate.
> >>>>
> >>>> If NOTIFY is broken, then that could be tackled as a separate issue.
> >>>> Better to establish that normal REFRESH-timed replication works
> >>>> before getting into the arcana of NOTIFY.
> >>>>
> >>>>
> >>>>> I am getting the following on the master, but I don't have a
> >>>>> server or client using the following ip address:
> >>>>>
> >>>>> Jul 19 12:11:20 localhost named[8329]: client 192.168.0.200#2679: updating zone 'digital-pipe.local/IN': update failed: 'RRset exists (value dependent)' prerequisite not satisfied (NXRRSET)
> >>>>> Jul 19 12:11:20 localhost named[8329]: client 192.168.0.200#2682: update 'digital-pipe.local/IN' denied
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>> Probably just a stupid Wintel client that's misconfigured to
> >>>> register its name in DNS.
> >>>>
> >>>>
> >>>> - Kevin
> >>
> >> --
> >> Best regards
> >>
> >> Sten Carlsen
> >>
> >> Let HIM who has an empty INBOX send the first mail.