shutting down due to TCP receive error

Tue Mar 8 01:33:11 UTC 2005

Clade <cdelia at melitacable.com> wrote:
> I work at an ISP. We have two name servers, a primary and a
> secondary, plus a caching server. The primary name server and the
> caching server run Solaris 9 and BIND 9.2.3. The secondary name
> server runs BIND 9.2.2 on Solaris 8. Lately, I have been noticing
> error messages like

> named[135]: [ID 873579 daemon.error] dispatch 41d110: shutting down
> due to TCP receive error: connection reset

> on all three servers. Can someone please help? Although the
> connection is said to be reset, the name service does not appear to
> be affected in any way. On the 2 servers running BIND 9.2.3, BIND is
> running in chroot mode where a directory /var/named has been created
> and all BIND files and services are operating from within this
> directory.

Do you run TCP between the three servers?  If so, Solaris may still
have text in the RST segment that explains the reason for the reset -
a tcpdump trace and a bit of hex decode would be in order, something
that should not be too difficult for an ISP :)
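
As a rough sketch of the "bit of hex decode" step: assuming you have
already pulled the raw bytes of one TCP segment (TCP header onward, no
IP or Ethernet header) out of a capture, something like the following
Python would check the RST flag and show any text carried after the
header. The function name rst_reason and the sample payload are made up
for illustration, and whether a given stack puts text in its RSTs at
all is not guaranteed.

```python
import struct


def rst_reason(segment: bytes):
    """Return the text carried in a TCP RST segment, or None if not a RST.

    `segment` is assumed to be a raw TCP segment starting at the TCP
    header (no IP/Ethernet header). Hypothetical helper for illustration.
    """
    if len(segment) < 20:
        raise ValueError("shorter than a minimal TCP header")

    # Bytes 12-13 hold the data offset (top 4 bits) and the flag bits.
    offset_flags = struct.unpack("!H", segment[12:14])[0]
    header_len = (offset_flags >> 12) * 4  # data offset is in 32-bit words
    if header_len < 20 or header_len > len(segment):
        raise ValueError("bad TCP data offset")

    if not offset_flags & 0x04:  # 0x04 is the RST bit
        return None

    # Whatever follows the header is the (optional) explanatory data.
    return segment[header_len:].decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hand-built RST segment with a made-up payload, purely as a demo.
    header = struct.pack("!HHIIHHHH", 53, 40000, 0, 0, (5 << 12) | 0x04, 0, 0, 0)
    print(repr(rst_reason(header + b"example reason")))
```

An empty string back means the segment was a RST with no data, which is
what most stacks send.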

Otherwise, there can be many reasons for a RST - the remote may have
given up on trying to transmit data to your system(s).  It could be a
Windows or other application that is (ab)using the SO_LINGER option to
use abortive closes.  It could be that your system sent data to the
remote after the remote had called close().
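
To illustrate the SO_LINGER case, here is a minimal loopback sketch in
Python. It assumes a Unix-like system where struct linger is two C ints
(Windows packs it differently), and the function name
abortive_close_demo is made up for the example. One side sets SO_LINGER
with a zero timeout and calls close(); the other side is blocked in
recv(), much as named would be on its TCP dispatch socket.

```python
import socket
import struct


def abortive_close_demo():
    """Show what a receiver sees when the peer does an abortive close.

    Returns "reset" if recv() fails with a connection reset, "eof" on a
    normal orderly close. Hypothetical helper; assumes a Unix-like
    system where struct linger is two C ints.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port on loopback
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # avoid hanging forever if nothing arrives

    # l_onoff=1, l_linger=0: close() discards unsent data and sends a
    # RST instead of going through the normal FIN handshake.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                      struct.pack("ii", 1, 0))
    client.close()

    try:
        data = server_side.recv(1024)
        return "eof" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(abortive_close_demo())
```

On the receiving end this shows up as a connection reset error rather
than a clean end-of-file, which would match the kind of message named
is logging.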

rick jones
-- 
The glass is neither half-empty nor half-full. The glass has a leak.
The real question is "Can it be patched?"
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to raj in cup.hp.com  but NOT BOTH...