Failover. Reestablishing contact after communications-interrupted

Martin Ericsson martin.s.ericsson at gmail.com
Wed May 30 06:22:29 UTC 2007


Hi.
Thanks for your answer. It's always nice to know that you're not the
only one experiencing a problem : )

After the last communication failure I have started to do some network
tracing, but I will not see the result until the servers go into
"Communications interrupted" again. Is there anything in particular I
can look for, or is it enough to see if there's some traffic at all
between the two servers on ports 519 and 520 after the communication
failure?
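
To know which part of the trace to look at, it might help to pull the
failover state changes out of syslog first. A minimal sketch, assuming the
messages look like the excerpt quoted further down in this mail (the wording
may differ between dhcpd versions); the function name find_transitions is
just something I made up for illustration:

```python
import re

# Pattern modelled on the syslog excerpt quoted in this thread.
# Assumption: other dhcpd versions may word this message differently.
TRANSITION = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+dhcpd: "
    r"failover peer (?P<peer>\S+): I move from (?P<old>\S+) to (?P<new>\S+)\s*$"
)


def find_transitions(lines):
    """Return (timestamp, host, peer, old_state, new_state) for each
    failover state change found in an iterable of syslog lines."""
    found = []
    for line in lines:
        match = TRANSITION.match(line)
        if match:
            found.append(
                (
                    match["when"],
                    match["host"],
                    match["peer"],
                    match["old"],
                    match["new"],
                )
            )
    return found
```

Feeding it the open /var/log/syslog file should give the timestamps at which
each server changed state, which can then be compared with the traffic seen
on ports 519 and 520 around those times.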

We actually had the same problem with a prior version. We used version
3.0.1-2, as it was standard in Debian Sarge, but upgraded to 3.0.5
hoping that this would solve the problem.

Should I file a bug about this issue?

2007/5/29, Glenn Satchell <Glenn.Satchell at uniq.com.au>:
>
> >Date: Tue, 29 May 2007 08:46:21 +0200
> >From: "Martin Ericsson" <martin.s.ericsson at gmail.com>
> >To: dhcp-users at isc.org
> >Subject: Failover. Reestablishing contact after communications-interrupted
> >
> >Hi.
> >I have experienced some problems with DHCP failover. The two servers
> >occasionally (once every week or so) lose contact. I guess it is a
> >network error. This is what happens:
> >
> >From syslog on server1:
> >/var/log/syslog
> >May 23 16:16:22 nadrdir01 dhcpd: failover: listener: no matching state
> >May 23 16:16:57 nadrdir01 dhcpd: timeout waiting for failover peer dhcp-failover
> >May 23 16:16:57 nadrdir01 dhcpd: peer dhcp-failover: disconnected
> >May 23 16:16:57 nadrdir01 dhcpd: failover peer dhcp-failover: I move
> >from normal to communications-interrupted
> >
> >From syslog on server2:
> >failover peer "dhcp-failover" state {
> >  my state communications-interrupted at 3 2007/05/23 14:16:17;
> >  partner state normal at 2 2007/05/15 09:02:47;
> >  mclt 600;
> >}
> >
> >
> >Both servers go into the "Communications-interrupted" state, but still
> >think the other server is in "Normal" state.
>
> The "partner state normal" is only set when the clients first connect
> and is then never updated, so you can't rely on it. I asked about this
> once before.
>
> If you only restart one server does it come back? This has been my
> experience.
>
> If you search the archives you may find an earlier post from me with
> the same problem with 3.0.4 (I think). I am sure it used to reconnect
> properly on prior releases. We have had the same problem where it
> wouldn't automatically reconnect when doing disaster tests and breaking
> the network between our two dhcp servers for a few hours.
>
> Is there any traffic if you trace network traffic on ports 519 and 520?
>
> I run 3.1.0a3 at home and that reconnects automatically, so when the
> stable release comes out that may be an option for you.
>
> regards,
> -glenn
>
>

