Best Practice to Restart dhcpd after omapi Shutdown

Thu Apr 27 16:47:23 UTC 2006

omshell does seem to be a good, clean way to shut down dhcpd.  Usually,
one needs to restart it as quickly as possible after the shutdown.
When using the kill `cat /var/run/dhcpd.pid` mechanism, dhcpd is
definitely dead, but it never gets a chance to put the failover peer
into "partner-down" mode.  If one uses omshell for the shutdown, there
seems to be no good way to tell afterward when it is safe to proceed
to the restart of dhcpd.
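
	For reference, a minimal sketch of what an omshell-driven
shutdown can look like from a plain shell script.  This is not the
expect script mentioned later in this message; the function name, the
address, and port 7911 are assumptions, and a server configured with
an omapi key would also need a key line.  It relies on the control
object described in dhcpd(8), where setting state to 2 requests a
shutdown:

```shell
#!/bin/sh
# Sketch only.  omapi_shutdown_commands is a hypothetical helper; the
# host and port passed to it are assumptions about the local setup.

# Print the omshell command stream that asks dhcpd to shut down by
# setting state = 2 on its control object.
omapi_shutdown_commands() {
    host=$1
    port=$2
    cat <<EOF
server $host
port $port
connect
new control
open
set state = 2
update
EOF
}

# Hypothetical use (assumes omshell is on the PATH and no key is needed):
#   omapi_shutdown_commands 127.0.0.1 7911 | omshell
```

Keeping the command stream in a function makes it easy to inspect
what would be sent before piping it to omshell.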

	What I notice is that dhcpd sometimes writes a pair of log
messages that look like this:

Apr 26 07:36:48 landlord dhcpd: Disabling output on BPF/fxp0/00:xx-
Apr 26 07:36:48 landlord dhcpd: Disabling input on BPF/fxp0/00:xx-

	In other words, dhcpd is leaving the stage and bringing down
the curtain.

	Unfortunately, this doesn't always happen so nicely.
Sometimes, there is no "Disabling" message.  One just sees the next
dhcpd startup banner, which makes me think that the omapi connections
never quite finished closing before the new instance of dhcpd
started.  With some daemons, such as bind, this kind of overlap is a
disaster and the new instance won't start correctly.

	The documentation for dhcpd says that it can take up to 25
seconds for an omapi shutdown to finish.  There is, however, no
positive feedback that one can use after giving the omapi update
command to know when the coast is clear.  I did get some improvement
when I put the following sequence into the shell script that calls
the expect script containing the omapi commands to kill dhcpd.  This
shell script does the shutdown and then loops until the old PID is
gone.  Here is the fragment that does that:

#Kill and restart dhcpd.
if test -s /var/run/dhcpd.pid; then
    #What's the current dhcpd process number?  Save it for later.
    DHCPPROC=$(cat /var/run/dhcpd.pid)
    #Send the omapi commands to kill dhcpd.
    /usr/local/etc/shutdown_dhcp.exp
    #Loop until the saved PID no longer shows up in ps.
    while ps -p "$DHCPPROC" >/dev/null; do
        sleep .5
    done
fi
#Always do this.
/usr/local/sbin/dhcpd -q
#dhcpd should now be running.

	That seemed to improve things somewhat, but I still get the
occasional case where dhcpd doesn't appear to have ended completely.
A further slight improvement came when I put a forced 1-second sleep
after the loop, which guarantees at least 1 second of dead time plus
however long the test loop waited for the process ID to vanish.
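
	A sketch of the wait loop with an upper bound added, so that a
hung shutdown cannot stall the restart forever.  The function name,
the one-second poll, and the 30-second limit (a little over the
documented 25 seconds) are assumptions.  kill -0 stands in for ps -p
here; it only works when the script is allowed to signal dhcpd, for
example when run as root:

```shell
#!/bin/sh
# Sketch only.  wait_for_exit and its arguments are hypothetical names,
# not part of dhcpd or omshell.

# wait_for_exit PID TIMEOUT
# Poll once per second until PID is gone or TIMEOUT seconds have passed.
# Returns 0 if the process disappeared, 1 if it is still there.
wait_for_exit() {
    pid=$1
    timeout=$2
    waited=0
    # kill -0 sends no signal; it only tests whether the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Hypothetical use after the omapi shutdown has been sent:
#   wait_for_exit "$DHCPPROC" 30 || echo "dhcpd still running" >&2
#   sleep 1    # the forced second of dead time
#   /usr/local/sbin/dhcpd -q
```

This still only shows that the old process is gone, not that the
failover peer has seen a clean shutdown.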

	Is there a condition I can look for that truly means dhcpd is
down?  That would be much better than being right most, but not all,
of the time.

	Thank you.

Martin McCormick WB5AGZ  Stillwater, OK 
Systems Engineer
OSU Information Technology Department Network Operations Group