libbind / FD_SETSIZE question (on Windows)

Stefan Puiu stefanpuiu at itcnetworks.ro
Wed Jan 26 14:11:26 UTC 2005


Hi again Danny,

Here's the patch to libbind (8.4.6) that fixes res_nsend() returning -1 
with errno set to ENOTSOCK on Windows for socket descriptors higher than 
FD_SETSIZE - 1. Sorry for the delay. Does it look ok to you? 
I've used 0xFFFFFFFF as highestFD, since I'm not aware of a Windows 
#define for the maximum socket descriptor value.

I've also defined INVALID_SOCKET on UNIX just for res_send.c; I haven't 
had the time to investigate where such a #define would be best placed.
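
A minimal sketch of what a shared definition could look like if it were moved out of res_send.c into a common header. The names res_sock_t and res_sock_is_open are hypothetical and do not exist in libbind; only INVALID_SOCKET and the -1 value on UNIX come from the patch.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET res_sock_t;      /* hypothetical name, not part of libbind */
#else
#include <sys/types.h>
#include <sys/socket.h>
typedef int res_sock_t;         /* hypothetical name, not part of libbind */
#ifndef INVALID_SOCKET
#define INVALID_SOCKET (-1)     /* value socket() returns on failure on UNIX */
#endif
#endif

/* Returns 1 if s holds a real descriptor, 0 if it is the "no socket" marker. */
static int res_sock_is_open(res_sock_t s)
{
	return s != INVALID_SOCKET;
}
```

With something like this, the nssocks[] checks in the patch would read the same on both platforms.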

Danny Mayer wrote:

> At 10:29 AM 12/15/2004, Stefan Puiu wrote:
>
>> Hello,
>>
>> we are using libbind 8.2.7 for an app that needs to do DDNS updates
>> (however, the code I'm talking about is still there in 8.4.5). Our app
>> calls res_nsendsigned(), which in turn calls res_nsend(), for sending
>> some updates. However, on Windows, after some stress testing,
>> res_nsendsigned() would begin to return an error, setting the errno
>> (actually, the value returned by WSAGetLastError()) to 10038, which is
>> WSAENOTSOCK. From what I could make out of the source, this error code
>> is set in lib/resolv/res_send.c from the libbind code, in the send_dg()
>> function (this one seems to get called, favouring UDP over TCP as
>> expected). In BIND 8.4.5, the code looks like this (line numbers on the
>> left):
>>
>>  770          if (EXT(statp).nssocks[ns] == -1) {
>>  771                  EXT(statp).nssocks[ns] = socket(nsap->sa_family, SOCK_DGRAM, 0);
>>  772                  if (EXT(statp).nssocks[ns] > highestFD) {
>>  773                          res_nclose(statp);
>>  774                          errno = ENOTSOCK;
>>  775                  }
>>
>> highestFD is FD_SETSIZE-1, which, on Windows, seems to be equal to
>> 16383. However, I'm having a hard time understanding why somebody would
>> want to limit the range of values returned by socket() that are
>> considered "valid" - unless they'd expect socket() to return the first
>> available free descriptor. On Windows, if you create 20,000 sockets,
>> then close them all, then try creating a new one, the descriptor that is
>> returned to you doesn't seem to be the lowest available, but the next
>> after the one returned by the last socket() call - we've observed this
>> behaviour with a simple test program. So, if at a certain point a socket
>> with fd 16383 is created, then most likely all subsequent calls to
>> send_dg() will fail, because socket() will return a value too big, even
>> though there wouldn't be 16383 descriptors actually open. If FD_SETSIZE
>> is supposed to limit the number of open descriptors, this piece of code
>> doesn't seem to achieve it; it breaks libbind apps running on Windows
>> instead.
>>
>> What I'm asking is: why is FD_SETSIZE needed? And of how much use is
>> this in the aforementioned situation? Is there a way to work around this
>> problem (like using some other function in libbind for sending packets)?
>
>
> This is a much more complicated subject than you may realise. The
> problems that you are seeing here in the library on Windows are one of
> the reasons that the socket code was completely rewritten for Windows in
> the BIND 9 native code. The other was related to performance.
>
> The problem is related to the fact that the Windows socket() function
> returns a 32-bit unsigned integer which can and does take any value in
> that range of numbers. FD_SETSIZE is really only valid for the fd_set used
> by the select() function and does not relate to the value of the socket
> fd; it's just the maximum number of sockets that select() can handle.
> (I'm simplifying somewhat here.) On Unix systems, fd's are basically
> created sequentially. Not so on Windows, which could hand you virtually
> any number whatsoever that isn't currently being used and regularly
> returns a large number. Even if it doesn't, I normally see it start with
> 1000 and go up from there. It's not practical to create a 4-gigabyte
> array just to allow indexing into it by fd, and it would be a VERY sparse
> array anyway. On Unix FD_SETSIZE is used to set up an array where the
> index is the fd value. A better strategy is to use a list to hold the
> info, but no one really wants to implement all of the necessary changes,
> especially large ones.
>
> For Windows the snippet above already shows a problem, since the code
> should use INVALID_SOCKET instead of -1. Creating the macro for Unix
> as -1 and using INVALID_SOCKET would make this work on all Unix and
> Windows platforms, but that's a simple change. On Windows highestFD would
> be 2**32-1 and not be related to FD_SETSIZE.
>
> I don't have time to look at the code and it's been a long time since
> I touched the BIND 8 code, but I suspect that there's a lot more work that
> would need to be done here to get all of the problems fixed.
>
> Danny
>
>
>
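
The UNIX side of the FD_SETSIZE point above can be sketched as follows. This is an illustration only, assuming a POSIX system where fd_set is typically a bitmap indexed by descriptor value; the helper names fd_fits_in_fdset and checked_fd_set are hypothetical and not taken from libbind.

```c
#include <sys/select.h>

/* UNIX only: a descriptor must be non-negative and below FD_SETSIZE
 * before it may be passed to FD_SET(); otherwise behaviour is undefined. */
static int fd_fits_in_fdset(int fd)
{
	return fd >= 0 && fd < FD_SETSIZE;
}

/* Adds fd to set if it fits. Returns 0 on success, or -1 if the
 * descriptor cannot be used with select(). */
static int checked_fd_set(int fd, fd_set *set)
{
	if (!fd_fits_in_fdset(fd))
		return -1;
	FD_SET(fd, set);
	return 0;
}
```

On Windows the same range test says nothing useful, because there fd_set holds a list of socket handles and FD_SETSIZE only caps how many handles the list can contain.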



-- Attached file included as plaintext by Ecartis --
-- File: res_send.diff

--- lib/resolv/res_send.c	2005-01-26 16:10:29.012568000 +0200
+++ lib/resolv/res_send.c.new	2005-01-26 16:04:04.681664000 +0200
@@ -110,7 +110,16 @@
 
 #define EXT(res) ((res)->_u._ext)
 
+#ifndef WIN32
+/* UNIX */
+#define INVALID_SOCKET -1
 static const int highestFD = FD_SETSIZE - 1;
+#else
+/* On Windows, socket() can return any value between 0 and 2^32-1, so
+ * FD_SETSIZE - 1 is not the highest possible socket descriptor.*/
+static const int highestFD = 0xFFFFFFFF;
+#endif
+
 
 /* Forward. */
 
@@ -314,7 +323,7 @@
 					break;
 				}
 
-				if (EXT(statp).nssocks[ns] == -1)
+				if (EXT(statp).nssocks[ns] == INVALID_SOCKET)
 					continue;
 				peerlen = sizeof(peer);
 				if (getsockname(EXT(statp).nssocks[ns],
@@ -340,7 +349,7 @@
 	if (EXT(statp).nscount == 0) {
 		for (ns = 0; ns < statp->nscount; ns++) {
 			EXT(statp).nstimes[ns] = RES_MAXTIME;
-			EXT(statp).nssocks[ns] = -1;
+			EXT(statp).nssocks[ns] = INVALID_SOCKET;
 			if (!statp->nsaddr_list[ns].sin_family)
 				continue;
 			EXT(statp).ext->nsaddrs[ns].sin =
@@ -767,7 +776,7 @@
 
 	nsap = get_nsaddr(statp, ns);
 	nsaplen = get_salen(nsap);
-	if (EXT(statp).nssocks[ns] == -1) {
+	if (EXT(statp).nssocks[ns] == INVALID_SOCKET) {
 		EXT(statp).nssocks[ns] = socket(nsap->sa_family, SOCK_DGRAM, 0);
 		if (EXT(statp).nssocks[ns] > highestFD) {
 			res_nclose(statp);
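
One thing that may be worth checking in the Windows branch of the diff: highestFD is declared as int, and on compilers with a 32-bit int, initializing it with 0xFFFFFFFF typically yields -1 (the conversion is implementation-defined). If the descriptor in nssocks[] is also held in a signed int, which is an assumption here, the comparison would then reject every valid socket. The sketch below only illustrates the difference between the two comparisons; the function names are hypothetical.

```c
/* Signed comparison, as it would behave if both the descriptor and
 * highestFD are plain int and highestFD ended up as -1. */
static int rejected_signed(int fd, int highest)
{
	return fd > highest;
}

/* The same check done on unsigned values, so that the limit really is
 * 2^32-1 and no 32-bit descriptor can exceed it. */
static int rejected_unsigned(unsigned int fd, unsigned int highest)
{
	return fd > highest;
}
```

Declaring highestFD as unsigned on Windows, or skipping the range check there altogether, would avoid the problem under that assumption.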



