FW: [unisog] Mac OS X 10.4.x "DHCP client sometimes remains BOUND after sending DHCPDISCOVER" bug

Frank Bulk frnkblk at iname.com
Wed Jan 31 05:30:30 UTC 2007


FYI, as this seems pertinent.

Frank 

-----Original Message-----
From: unisog-bounces at lists.dshield.org
[mailto:unisog-bounces at lists.dshield.org] On Behalf Of Irwin Tillman
Sent: Tuesday, January 30, 2007 3:37 PM
To: unisog at lists.dshield.org
Subject: [unisog] Mac OS X 10.4.x "DHCP client sometimes remains BOUND
after sending DHCPDISCOVER" bug

At Princeton we've been seeing IP address conflicts due to an issue in Mac
OS X 10.4.x. As I suspect other schools with high DHCP lease churn rates
from Mac OS X 10.4.x clients may experience the same bug, I thought I'd
post the details for you.

----

Since September 2005 (yes, 2005) I've been seeing a DHCP client issue from
Mac OS X 10.4.x systems at Princeton University, where I maintain DHCP
service. I call it the Mac OS X 10.4.x "DHCP client sometimes remains BOUND
after sending DHCPDISCOVER" bug.

I reported it to Apple (Apple Bug Reporter Problem ID 4904550).
Apple has examined it and confirmed that Mac OS X does indeed behave
this way, and believes this behavior is correct (consistent with
RFC 2131). I believe the behavior violates RFC 2131.
If the behavior is part of an implementation of 'Detection of Network
Attachment in IPv4 (DNAv4)', then it also violates RFC 4436.

Depending on how your DHCP server operates, this behavior may result
in Macs using IP addresses no longer leased to them, interfering
with network service to other devices.  (This is dependent on
the DHCP server behavior; some DHCP servers can tolerate a client
that malfunctions in this way.)

If you don't monitor for this particular problem (i.e. that your DHCP
clients are using IP addresses no longer leased to them), you may not be
aware of the problem.  You might hear sporadic complaints from victims
when they are leased an IP address that is "stolen" by another device
(i.e. the malfunctioning Mac OS X 10.4.x device).  But given that the victim
may work around the problem by requesting a new DHCP lease, you may
not receive many complaints from your customers; they may just chalk it up
to things being flaky.

If you have other network equipment monitoring DHCP traffic to infer the IP
addresses leased to clients, this behavior may also result in that
equipment's conclusions differing from the clients'. (Again, this depends on
whether that equipment can tolerate a client that malfunctions in this way.)

Apple currently believes that the behavior of the Mac OS X 10.4.x client
is correct; I was not able to convince them that the behavior is incorrect.
As a result, if you are affected by this problem, you may choose either to
endure the problem, or to replace the affected facilities (e.g. DHCP servers
or DHCP-snooping equipment) with others that are tolerant of the incorrect
Mac OS X 10.4.x behavior.

Below is a (lengthy) technical description.

Irwin Tillman
OIT Network Systems / Princeton University

--

Mac OS X 10.4.x "DHCP client sometimes remains BOUND after sending
DHCPDISCOVER" bug
January 30 2007

* Technical Overview:

Some time after obtaining a DHCP lease (entering the DHCP BOUND state), the
client sends one or more DHCPDISCOVER packets. This implies the client has
returned to the DHCP INIT state, relinquishing the old DHCP lease. In some
cases, the client ignores all offers sent in response to the DHCPDISCOVER
packet(s), or those offers never reach the client (e.g. are dropped by the
network). However, instead of remaining in the DHCP INIT state, the client
continues to act as if the old DHCP lease is still in the BOUND state. It
keeps using the IP address from the old lease (even trying to RENEW and later
REBIND the old lease).

Because other DHCP clients may be leased the IP address after the first
client relinquishes its lease on the address, the first client's continued
use of that IP address interferes with service to those other clients.

I can positively confirm that the problem began no later than September
2005. At that time, Mac OS X 10.4.2 was the latest version of the OS
available. Prior to then, I did not see the Mac OS X clients here exhibit
this problem. Based on that, I believe the issue was introduced into Mac OS X
in version 10.4.2 or earlier. Given that 10.4 was released in late April
2005, and that usage at our institution would make the problem difficult to
notice over the summer (when most of our customers are away), I can imagine
the problem may have been introduced as far back as version 10.4 or 10.4.1.

At Princeton, the problem has slowly grown from a rare occurrence to a
frequent problem. That's due to the growth in the number of Mac OS X 10.4.x
systems at our site, and to the higher DHCP lease churn rate for each client
(associated with the increasing use of wireless laptops that connect and
disconnect often). At first I detected only a few incidents per month
throughout our entire institution; by now I generally see several each day.

In many of these cases, our support staff have examined the malfunctioning
Macs and found no apparent problems. The devices appeared to be properly
configured to use DHCP, with no special circumstances. (E.g. there was no
second device forging the first one's hardware address or DHCP Client
Identifier, no VM software running a separate DHCP client instance, and no
use of Apple's (or a third party's) NAT software. There was nothing to
indicate that a second DHCP client instance was running with the same
DHCP Client Identifier on the same network. They really appeared to be just
simple Mac OS X DHCP clients, running current (at the time) versions of
10.4.x.)

After exhibiting the malfunction, the device may go weeks or months before
exhibiting the malfunction (stealing an IP address) again...or it may happen
again a few hours later. I know of no way to force any one device to
reproduce the problem; I am only able to detect the problem the day after
the fact, when I see "stolen IP address" problems by reviewing daily logs of
unexpected DHCP server transactions and comparing them to IP usage data
drawn from router ARP cache snapshots.
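
The comparison of DHCP server records against ARP cache snapshots can be
sketched roughly as follows. This is only an illustration of the idea, not
the tooling used at Princeton: the record layouts (Lease, ArpSighting), the
function name find_stolen, and the sample addresses are all assumptions made
up for the example, and it assumes every address in the pool is handed out
only by DHCP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    ip: str
    client: str   # hardware address (hypothetical key for the example)
    start: int    # time the lease was granted (epoch seconds)
    end: int      # time the lease expired or was relinquished


@dataclass(frozen=True)
class ArpSighting:
    ip: str
    client: str   # hardware address seen in a router ARP cache snapshot
    seen: int     # time of the snapshot (epoch seconds)


def find_stolen(leases, sightings):
    """Return sightings where the device using an IP held no lease on it then."""
    stolen = []
    for sighting in sightings:
        held = any(
            lease.ip == sighting.ip
            and lease.client == sighting.client
            and lease.start <= sighting.seen < lease.end
            for lease in leases
        )
        if not held:
            stolen.append(sighting)
    return stolen


leases = [
    Lease("192.0.2.10", "mac-a", start=0, end=500),     # relinquished at t=500
    Lease("192.0.2.10", "mac-b", start=600, end=4200),  # re-leased to a victim
]
sightings = [
    ArpSighting("192.0.2.10", "mac-a", seen=300),  # legitimate use
    ArpSighting("192.0.2.10", "mac-a", seen=900),  # still answering ARP
    ArpSighting("192.0.2.10", "mac-b", seen=900),  # the rightful holder
]
stolen = find_stolen(leases, sightings)
```

Here only the second sighting of mac-a is flagged, since by then the server
regards its lease as relinquished.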

--

* The Packets

In more detail, the DHCP packets (and IP usage) I see from the
malfunctioning Mac OS X clients are:

1) The DHCP client obtains a lease (reaches the DHCP BOUND state).

2) The client may renew the lease 0 or more times. These renewals might
   happen at the expected time T1, or might happen within seconds of the
   client reaching the BOUND state.

3) Before the lease is due to expire, the client broadcasts a DHCPDISCOVER
   packet.

   Since the client is still attached to the same network, and is still
   using the same DHCP Client Identifier, this implies the client has
   entered the DHCP INIT state, implicitly relinquishing the old lease.

   Sometimes this is just a few seconds after the client entered
   the BOUND state; sometimes it is minutes or hours later.
   It seems to be before time T1, when the lease would have reached the
   time to renew it.

4) One or more DHCP server(s) respond to the client with DHCPOFFER(s).

5) The client does not accept any of the offers; it sends no DHCPREQUEST.
   I.e. it never proceeds to the DHCP SELECTING state.

6) Optionally, the client retransmits the DHCPDISCOVER packet several times
   over the next minute. If so, the DHCP server(s) respond with
   DHCPOFFER(s), which the client again ignores.

7) In almost all cases, the client continues to use the IP address from the
   lease it relinquished earlier.  That is, it continues to answer IP ARP
   requests for the IP address that was part of the relinquished lease.
   We can see this in snapshots taken of our IP router ARP caches.
   It continues to transmit IP packets with this value as the IP source
   address.

8) Optionally, at the time the relinquished lease would have reached time
   T1, the client tries to RENEW the relinquished lease. The DHCP server
   that had granted the relinquished lease responds with a DHCPNAK. The
   client continues to use the IP address, and may try to renew the
   relinquished lease additional times until time T2.

   Sometimes these DHCPREQUESTs are also malformed. Specifically, the DHCP
   'Server IP Address Option' is 0, the DHCP 'Requested IP Address Option'
   is 0, and the 'ciaddr' field is 0. (There is no case where a DHCP client
   should send a DHCPREQUEST packet with that set of characteristics.)

   Throughout this time (from the time the relinquished lease would have
   reached time T1 until it would have reached time T2), the client may also
   sometimes send DHCPDISCOVER packets, receive DHCPOFFERs, and ignore the
   offers.

9) Optionally, at the time the relinquished lease would have reached time
   T2, the client tries to REBIND the relinquished lease. The DHCP server
   that had granted the relinquished lease responds with a DHCPNAK. Other
   DHCP servers do not respond. The client continues to use the IP address,
   and may try to rebind the relinquished lease additional times until the
   time the relinquished lease was to have expired.

   Sometimes these DHCPREQUESTs are also malformed. Specifically, the DHCP
   'Server IP Address Option' is 0, the DHCP 'Requested IP Address Option'
   is 0, and the 'ciaddr' field is 0. (There is no case where a DHCP client
   should send a DHCPREQUEST packet with that set of characteristics.)

   Throughout this time (from the time the relinquished lease would have
   reached time T2 until it would have expired), the client may also
   sometimes send DHCPDISCOVER packets, receive DHCPOFFERs, and ignore the
   offers.

10) The problem ends in one of these ways:

    If the client is offline (e.g. disconnected from the network) at the
    time the relinquished lease was due to expire, when it next reconnects
    it starts in the DHCP INIT state and works properly. (It sends
    DHCPDISCOVER(s), receives DHCPOFFER(s), proceeds to SELECTING and BOUND,
    and uses the IP address obtained via its new lease.)

    Alternatively, if the client is connected to the network at the time the
    relinquished lease was due to expire, at that time the client stops
    using the IP address from the relinquished lease, enters the DHCP INIT
    state, and works properly. (It sends DHCPDISCOVER(s), receives
    DHCPOFFER(s), proceeds to SELECTING and BOUND, and uses the IP address
    obtained via its new lease.)
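
The malformed DHCPREQUESTs mentioned in steps 8 and 9 can be recognized with
a check along these lines. It is a sketch based on my reading of the
DHCPREQUEST rules in RFC 2131 section 4.3.2, and it assumes that the 'Server
IP Address Option' above corresponds to the RFC's 'server identifier'
option. The function names and the use of dotted-quad strings are
illustrative only.

```python
def is_set(value):
    """Treat a missing field and the all-zero address as 'not filled in'."""
    return value not in (None, "", "0.0.0.0")


def classify_dhcprequest(server_id, requested_ip, ciaddr):
    """Map the three fields to the client state that could have sent them."""
    sid, req, ci = is_set(server_id), is_set(requested_ip), is_set(ciaddr)
    if sid and req and not ci:
        return "SELECTING"              # answering a DHCPOFFER
    if not sid and req and not ci:
        return "INIT-REBOOT"            # verifying a remembered address
    if not sid and not req and ci:
        return "RENEWING or REBINDING"  # extending an active lease
    return "malformed"                  # no client state sends this combination


# The combination described in steps 8 and 9: all three values are zero.
observed = classify_dhcprequest("0.0.0.0", "0.0.0.0", "0.0.0.0")
```

With all three values zero, none of the legitimate combinations matches, so
the packet is classified as malformed.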



-------

* Apple's Take

Apple has indicated to me that the Mac OS X DHCP client may indeed
transition directly from the DHCP INIT state to the DHCP BOUND state.

The situation they describe is:

a) A client is in the DHCP BOUND state.

b) The DHCP client enters the INIT-REBOOT state, e.g. as a result of
a sleep/wake cycle or link state change.

c) The DHCP client sends a DHCPREQUEST.

d) The DHCP client receives neither a DHCPACK nor a DHCPNAK.

(I note that in our case, the DHCPREQUEST may indeed have reached the DHCP
server and the DHCP server sent a DHCPACK.  Presumably this never reached
the client.)

e) Their DHCP client chooses to stop using the old lease, although there
is still time remaining on the lease.

(I note that RFC 2131 allows the client to continue using the old lease if
it wishes, until the old lease expires.)
Instead, their client chooses to return to the DHCP INIT state.

f) The DHCP client sends a DHCPDISCOVER.

g) The DHCP client does not receive a DHCPOFFER.

(I note that in our case, the DHCPDISCOVER did indeed reach the DHCP
servers, and the DHCP servers sent DHCPOFFERs to the client.  Presumably
none of the multiple DHCPOFFERs reached the client (e.g. all were dropped by
the network), or the client has gone selectively deaf to just these offers.)

h) The DHCP client sends an ARP request to the IP router that was its
default gateway in the old lease.  The router responds, and the client
receives the response.

i) The DHCP client goes back to the DHCP BOUND state and resumes using the
old lease.  It will enter RENEWING and REBINDING state as usual.



I see two problems with the client behavior Apple has described:

* It doesn't explain why the DHCPREQUEST packets I observe in steps 8 and 9
are sometimes malformed.

* More importantly, the client transitions directly from the DHCP INIT state
to the BOUND state.  I don't believe that's permitted in DHCP.

Specifically, Figure 5 in RFC 2131 contains the state transition diagram
for DHCP clients.  (There have been some changes due to later RFCs, but
none that are directly relevant to this matter.)

It makes it clear that the only time a DHCP client may send a DHCPDISCOVER
is when it is in the DHCP INIT state, and that there is no way for a client
in the DHCP INIT state to get to the DHCP BOUND state without the server
granting it a lease.  There's no provision in the RFC for a client attached
to a single subnet to "go back" to using an old lease on that subnet.  Once
a client identified by a unique (client identifier, subnet) tuple sends a
DHCPDISCOVER, any old lease identified by that (client identifier, subnet)
tuple is no longer valid.  The client has abandoned any old lease identified
by that tuple.
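
To make the argument concrete, here is a small sketch that encodes the
client state transitions as a lookup table and checks a sequence of states
against it. The table is my own simplified transcription of Figure 5 of RFC
2131 and should be checked against the RFC before being relied on; the names
TRANSITIONS and first_illegal_step are made up for the example. Note that in
this transcription a client that has sent a DHCPDISCOVER sits in SELECTING;
either way, every edge into BOUND requires a DHCPACK.

```python
# (from_state, to_state) -> event that causes the transition.
# Simplified transcription of RFC 2131 Figure 5 (an assumption to verify).
TRANSITIONS = {
    ("INIT", "SELECTING"): "send DHCPDISCOVER",
    ("SELECTING", "REQUESTING"): "select offer, send DHCPREQUEST",
    ("REQUESTING", "BOUND"): "DHCPACK",
    ("REQUESTING", "INIT"): "DHCPNAK or DHCPDECLINE",
    ("INIT-REBOOT", "REBOOTING"): "send DHCPREQUEST",
    ("REBOOTING", "BOUND"): "DHCPACK",
    ("REBOOTING", "INIT"): "DHCPNAK",
    ("BOUND", "RENEWING"): "T1 expires, send DHCPREQUEST",
    ("RENEWING", "BOUND"): "DHCPACK",
    ("RENEWING", "REBINDING"): "T2 expires, broadcast DHCPREQUEST",
    ("RENEWING", "INIT"): "DHCPNAK",
    ("REBINDING", "BOUND"): "DHCPACK",
    ("REBINDING", "INIT"): "DHCPNAK or lease expired",
}


def first_illegal_step(path):
    """Return the first (from, to) pair not in the table, or None if all are."""
    for step in zip(path, path[1:]):
        if step not in TRANSITIONS:
            return step
    return None


# A normal acquisition followed by a renewal is accepted.
normal = first_illegal_step(
    ["INIT", "SELECTING", "REQUESTING", "BOUND", "RENEWING", "BOUND"]
)

# Going from INIT straight back to BOUND is not in the table.
shortcut = first_illegal_step(["BOUND", "RENEWING", "INIT", "BOUND"])

# Every transition into BOUND is triggered by a DHCPACK from a server.
into_bound = {event for (_, dst), event in TRANSITIONS.items() if dst == "BOUND"}
```

In this table the only events leading into BOUND are DHCPACKs, which is the
point made above: the server must grant the lease.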

--

* DNAv4

The client behavior may be Apple's implementation of "Detection
of Network Attachment in IPv4 (DNAv4)" (RFC 4436).

However, Apple's client is not behaving the way that RFC describes.

Specifically, RFC 4436 (section 2.2, paragraph 1) states that a client
in the situation we're seeing (the client has an "operable routable IPv4
address") should broadcast a DHCPREQUEST message from the INIT-REBOOT state.
(It says to broadcast a DHCPDISCOVER from the INIT state if the client
doesn't have an operable routable IPv4 address on any network, but that's
not the case here.)

So if what Apple's doing is an implementation of DNAv4, it's not doing it
right.

--

* Summary

Apple's stance is that a DHCP client can indeed go back to resume using the
old lease; that sending a DHCPDISCOVER doesn't imply that the client has
abandoned that lease.

I believe that's not right; it appears to me that it is not permitted by
the state transition diagram (Figure 5) in RFC 2131.  And if what they're
doing is DNAv4, RFC 4436 also makes it clear that what their client is
doing is wrong.

So I believe that the behavior introduced in the Mac OS X 10.4.x DHCP client
is not correct.

If your DHCP server makes use of the DHCPDISCOVER to decide that the
old lease identified by (client identifier, subnet) has been abandoned by
the client, this client behavior will cause a problem.  That's because as
far as the server is concerned, the old lease has been abandoned by the
client, but the client proceeds to use that IP address.  Your server may
lease that IP address to another device.
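
The interaction can be illustrated with a toy model. The ToyServer class
below is hypothetical and far simpler than any real DHCP server; its one
deliberate assumption is that a DHCPDISCOVER from a client frees whatever
lease that client held. The client names and the single-address pool are
likewise invented for the example.

```python
class ToyServer:
    """Toy lease table; assumes a DHCPDISCOVER frees the sender's old lease."""

    def __init__(self, pool):
        self.free = list(pool)
        self.leases = {}  # client identifier -> leased IP

    def on_discover(self, client):
        old = self.leases.pop(client, None)
        if old is not None:
            self.free.append(old)  # old lease is treated as abandoned
        return self.free[0] if self.free else None  # address offered

    def on_request(self, client, ip):
        if self.leases.get(client) == ip:
            return "DHCPACK"  # renewal of a lease the server still records
        if ip in self.free:
            self.free.remove(ip)
            self.leases[client] = ip
            return "DHCPACK"
        return "DHCPNAK"


server = ToyServer(["192.0.2.10"])
in_use = {}  # the address each client is actually using

# The Mac obtains a lease normally.
offer = server.on_discover("mac")
server.on_request("mac", offer)
in_use["mac"] = offer

# The Mac sends a stray DHCPDISCOVER, ignores the offer, keeps its address.
server.on_discover("mac")

# A second client is then leased the address the server considers free.
offer = server.on_discover("victim")
victim_reply = server.on_request("victim", offer)
in_use["victim"] = offer

# The Mac's later attempt to renew the relinquished lease is refused.
renew_reply = server.on_request("mac", in_use["mac"])
```

At the end both clients are using 192.0.2.10, and the Mac's renewal attempt
draws a DHCPNAK, matching what is described in step 8 above.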

And if you have equipment that monitors DHCP traffic
to learn which IP addresses are leased to clients, the conclusions
reached by that equipment will not always match those of Mac OS X 10.4.x
clients.

--
