libbind / FD_SETSIZE question (on Windows)
Stefan Puiu
stefanpuiu at itcnetworks.ro
Wed Jan 26 14:11:26 UTC 2005
Hi again Danny,
here's the patch to libbind (8.4.6) that fixes res_nsend() returning -1
with errno set to ENOTSOCK on Windows for socket descriptors higher
than FD_SETSIZE - 1. Sorry for the delay. Does it look OK to you?
I've used 0xFFFFFFFF as highestFD, since I'm not aware of a Windows
#define for the maximum socket descriptor value.
I've also defined INVALID_SOCKET on UNIX just for res_send.c; I haven't
had the time to investigate where such a #define would be best placed.
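
For illustration, here is a minimal sketch of how the "is this a valid
socket" test and the "does it fit in an fd_set" test could be kept
separate on both platforms. The names res_sock_t, sock_is_valid(),
sock_fits_fd_set() and sock_add_to_set() are hypothetical and are not
part of libbind; this is not the patch itself, only a sketch of the idea.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET res_sock_t;          /* hypothetical alias, unsigned handle */
#else
#include <sys/select.h>
typedef int res_sock_t;             /* hypothetical alias, small integer */
#ifndef INVALID_SOCKET
#define INVALID_SOCKET (-1)         /* same fallback the patch introduces */
#endif
#endif

/* Nonzero if socket() returned a usable descriptor. */
static int sock_is_valid(res_sock_t s)
{
#ifdef _WIN32
    return s != INVALID_SOCKET;
#else
    return s >= 0;
#endif
}

/* Nonzero if the descriptor may be passed to FD_SET(). */
static int sock_fits_fd_set(res_sock_t s)
{
#ifdef _WIN32
    /* fd_set stores handles, so the handle's value does not matter;
     * FD_SETSIZE only caps how many handles one set can hold. */
    return sock_is_valid(s);
#else
    /* fd_set is a bit array indexed by the descriptor value. */
    return s >= 0 && s < FD_SETSIZE;
#endif
}

/* Add a descriptor to a set, refusing values FD_SET() cannot handle. */
static void sock_add_to_set(res_sock_t s, fd_set *set)
{
    assert(sock_fits_fd_set(s));
    FD_SET(s, set);
}
```

With this split, only the UNIX branch ever compares a descriptor against
FD_SETSIZE, which is the behaviour the patch is aiming for.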
Danny Mayer wrote:
> At 10:29 AM 12/15/2004, Stefan Puiu wrote:
>
>> Hello,
>>
>> we are using libbind 8.2.7 for an app that needs to do DDNS updates
>> (however, the code I'm talking about is still there in 8.4.5). Our app
>> calls res_nsendsigned(), which in turn calls res_nsend(), for sending
>> some updates. However, on Windows, after some stress testing,
>> res_nsendsigned() would begin to return an error, setting the errno
>> (actually, the value returned by WSAGetLastError()) to 10038, which is
>> WSAENOTSOCK. From what I could make out of the source, this error code
>> is set in lib/resolv/res_send.c from the libbind code, in the send_dg()
>> function (this one seems to get called, favouring UDP over TCP as
>> expected). In BIND 8.4.5, the code looks like this (line numbers on the
>> left):
>>
>> 770 if (EXT(statp).nssocks[ns] == -1) {
>> 771 EXT(statp).nssocks[ns] = socket(nsap->sa_family,
>> SOCK_DGRAM, 0);
>> 772 if (EXT(statp).nssocks[ns] > highestFD) {
>> 773 res_nclose(statp);
>> 774 errno = ENOTSOCK;
>> 775 }
>>
>> highestFD is FD_SETSIZE-1, which, on Windows, seems to be equal to
>> 16383. However, I'm having a hard time understanding why somebody would
>> want to limit the range of values returned by socket() that are
>> considered "valid" - unless they'd expect socket() to return the first
>> available free descriptor. On Windows, if you create 20,000 sockets,
>> then close them all, then try creating a new one, the descriptor that is
>> returned to you doesn't seem to be the lowest available, but the next
>> after the one returned by the last socket() call - we've observed this
>> behaviour with a simple test program. So, if at a certain point a socket
>> with fd 16383 is created, then most likely all subsequent calls to
>> send_dg() will fail, because socket() will return a value too big, even
>> though there wouldn't be 16383 descriptors actually open. If FD_SETSIZE
>> is supposed to limit the number of open descriptors, this piece of code
>> doesn't seem to achieve it; it breaks libbind apps running on Windows
>> instead.
>>
>> What I'm asking is: why is FD_SETSIZE needed? And of how much use is
>> this in the aforementioned situation? Is there a way to work around this
>> problem (like using some other function in libbind for sending packets)?
>
>
> This is a much more complicated subject than you may realise. The
> problems that you are seeing here in the library on Windows are one of
> the reasons that the socket code was completely rewritten for Windows
> in the BIND 9 native code. The other was related to performance.
>
> The problem is related to the fact that the Windows socket() function
> returns a 32-bit unsigned integer which can and does take any value in
> that range of numbers. FD_SETSIZE is really only valid for the fd_set
> used by the select() function and does not relate to the value of the
> socket fd; it's just the maximum number of sockets that select() can
> handle. (I'm simplifying somewhat here.) On Unix systems, fd's are
> basically created sequentially. Not so on Windows, which could hand you
> virtually any number whatsoever that isn't currently being used and
> regularly returns a large number. Even if it doesn't, I normally see it
> start with 1000 and go up from there. It's not practical to create a
> 4-gigabyte array just to allow indexing into it by fd, and it would be
> a VERY sparse array anyway. On Unix FD_SETSIZE is used to set up an
> array where the index is the fd value. A better strategy is to use a
> list to hold the info, but then no one really wants to implement all of
> the necessary changes, especially large ones.
>
> For Windows the snippet above already shows a problem, since the code
> should use INVALID_SOCKET instead of -1. Creating the macro for Unix
> as -1 and using INVALID_SOCKET would make this work on all Unix and
> Windows platforms, but that's a simple change. On Windows highestFD
> would be 2**32-1 and not be related to FD_SETSIZE.
>
> I don't have time to look at the code and it's been a long time since
> I touched the BIND 8 code, but I suspect that there's a lot more work
> that would need to be done here to get all of the problems fixed.
>
> Danny
>
>
>
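
The "list instead of an fd-indexed array" idea from the reply above could
look roughly like the sketch below: per-socket info is stored in a small
table that is searched by socket value, so the value itself can be
anything in the 32-bit range. All names here (sock_handle_t, sock_table,
MAX_TRACKED and the sock_table_* functions) are hypothetical and do not
come from BIND; a fixed-size table with linear search is assumed purely
to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 8                  /* hypothetical capacity */

typedef unsigned long sock_handle_t;   /* stand-in for SOCKET or int */

struct sock_entry {
    sock_handle_t sock;                /* descriptor value, used as the key */
    int in_use;                        /* nonzero if this slot is occupied */
    void *info;                        /* caller-owned per-socket data */
};

struct sock_table {
    struct sock_entry entries[MAX_TRACKED];
};

/* Find the slot holding socket s, or NULL if it is not tracked. */
static struct sock_entry *sock_table_find(struct sock_table *t, sock_handle_t s)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < MAX_TRACKED; i++)
        if (t->entries[i].in_use && t->entries[i].sock == s)
            return &t->entries[i];
    return NULL;
}

/* Store info for socket s; returns 0 on success, -1 if the table is full. */
static int sock_table_put(struct sock_table *t, sock_handle_t s, void *info)
{
    struct sock_entry *e = sock_table_find(t, s);
    size_t i;

    if (e == NULL) {
        /* Not tracked yet: take the first free slot. */
        for (i = 0; i < MAX_TRACKED && e == NULL; i++)
            if (!t->entries[i].in_use)
                e = &t->entries[i];
    }
    if (e == NULL)
        return -1;
    e->sock = s;
    e->in_use = 1;
    e->info = info;
    return 0;
}

/* Return the info stored for socket s, or NULL if it is not tracked. */
static void *sock_table_get(struct sock_table *t, sock_handle_t s)
{
    struct sock_entry *e = sock_table_find(t, s);

    return e != NULL ? e->info : NULL;
}

/* Forget socket s; does nothing if it is not tracked. */
static void sock_table_remove(struct sock_table *t, sock_handle_t s)
{
    struct sock_entry *e = sock_table_find(t, s);

    if (e != NULL)
        e->in_use = 0;
}
```

The memory used depends on how many sockets are open at once, not on how
large the descriptor values are, which is the property the sparse
fd-indexed array lacks on Windows.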
-- Attached file included as plaintext by Ecartis --
-- File: res_send.diff
--- lib/resolv/res_send.c 2005-01-26 16:10:29.012568000 +0200
+++ lib/resolv/res_send.c.new 2005-01-26 16:04:04.681664000 +0200
@@ -110,7 +110,16 @@
#define EXT(res) ((res)->_u._ext)
+#ifndef WIN32
+/* UNIX */
+#define INVALID_SOCKET -1
static const int highestFD = FD_SETSIZE - 1;
+#else
+/* On Windows, socket() can return any value between 0 and 2^32-1, so
+ * FD_SETSIZE - 1 is not the highest possible socket descriptor.*/
+static const int highestFD = 0xFFFFFFFF;
+#endif
+
/* Forward. */
@@ -314,7 +323,7 @@
break;
}
- if (EXT(statp).nssocks[ns] == -1)
+ if (EXT(statp).nssocks[ns] == INVALID_SOCKET)
continue;
peerlen = sizeof(peer);
if (getsockname(EXT(statp).nssocks[ns],
@@ -340,7 +349,7 @@
if (EXT(statp).nscount == 0) {
for (ns = 0; ns < statp->nscount; ns++) {
EXT(statp).nstimes[ns] = RES_MAXTIME;
- EXT(statp).nssocks[ns] = -1;
+ EXT(statp).nssocks[ns] = INVALID_SOCKET;
if (!statp->nsaddr_list[ns].sin_family)
continue;
EXT(statp).ext->nsaddrs[ns].sin =
@@ -767,7 +776,7 @@
nsap = get_nsaddr(statp, ns);
nsaplen = get_salen(nsap);
- if (EXT(statp).nssocks[ns] == -1) {
+ if (EXT(statp).nssocks[ns] == INVALID_SOCKET) {
EXT(statp).nssocks[ns] = socket(nsap->sa_family, SOCK_DGRAM, 0);
if (EXT(statp).nssocks[ns] > highestFD) {
res_nclose(statp);
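
One detail of the Windows branch may be worth double-checking: highestFD
is declared as an int but initialised with 0xFFFFFFFF. Assuming a 32-bit
int and the usual two's-complement conversion, the stored value is -1, so
the result of the "> highestFD" test depends on whether the descriptor
being compared is signed or unsigned. The sketch below uses hypothetical
names and fixed-width types to show both cases; it does not claim to know
which type nssocks[] has in libbind.

```c
#include <assert.h>
#include <stdint.h>

/* Same initialiser as in the patch, stored in a signed 32-bit object.
 * On common compilers this conversion yields -1. */
static const int32_t highest_as_signed = (int32_t)0xFFFFFFFFu;

/* The same bit pattern kept in an unsigned object. */
static const uint32_t highest_as_unsigned = 0xFFFFFFFFu;

/* Signed comparison: any non-negative descriptor exceeds -1. */
static int rejected_signed(int32_t fd)
{
    return fd > highest_as_signed;
}

/* Unsigned comparison: no 32-bit descriptor can exceed 2^32 - 1. */
static int rejected_unsigned(uint32_t fd)
{
    return fd > highest_as_unsigned;
}

/* Small demonstration of the difference for an ordinary descriptor. */
static void demo(void)
{
    assert(rejected_signed(5));       /* 5 > -1, so it would be rejected */
    assert(!rejected_unsigned(5u));   /* 5 is not greater than 2^32 - 1 */
}
```

If the descriptor is held in a signed int, declaring the limit with an
unsigned type (or skipping the upper-bound test on Windows altogether)
would avoid the first outcome.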