Memory leak in 8.2.1 running under Solaris 2.7?

Mark_Andrews at isc.org Mark_Andrews at isc.org
Wed Aug 4 01:03:11 UTC 1999


	Tim,
	     If named is dying of resource starvation at 20 MB, it
	is being run on a seriously underconfigured machine for
	the load being put on it.  Being a caching server for a
	uni puts a heavy memory load on a DNS server.

	The amount of memory named can take from the system is
	controlled by "options { datasize #; };".  Just don't let
	datasize exceed the amount of real memory in the machine.
	You also need enough virtual memory to support several
	images simultaneously, as it forks prior to exec'ing
	{in.}named-xfer.
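
	As a rough illustration of the datasize option, a named.conf
	fragment might look like the sketch below.  The 64M figure is
	only a placeholder, not a recommendation; pick a value below
	the real memory actually installed in the machine.

```
options {
	// Placeholder limit on named's data segment: an assumed value
	// for illustration, to be kept below the machine's real memory.
	datasize 64M;
};
```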

	While you say named's memory usage grows without bound, it
	will normally stabilise after 7 days, as this is the maximum
	cache period, unless you have turned on host-statistics in
	the options block.
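
	One way to judge whether the growth is levelling off is to
	look at the hour-to-hour change in the SZ column rather than
	the absolute size.  The sketch below does that for the five
	hourly readings of PID 13352 quoted further down; the function
	names are made up for this example, and the values are left
	in pages because the page size of the machine is not stated.

```python
# Hourly SZ readings for in.named (PID 13352), copied from the quoted
# ps output (05:05 through 09:05). ps -l reports SZ in pages.
sz_pages = [11243, 11419, 11571, 11799, 12279]


def hourly_deltas(samples):
    """Return the change between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def mean_growth(samples):
    """Return the average change per interval over the whole series."""
    return (samples[-1] - samples[0]) / (len(samples) - 1)


deltas = hourly_deltas(sz_pages)
print("pages gained each hour:", deltas)
print("mean pages per hour:", mean_growth(sz_pages))
```

	A cache that is approaching its steady state should show these
	deltas shrinking towards zero over several days; a four-hour
	window like this one is too short to tell either way.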

	Mark

> Hi,
> 
> I have installed bind 8.2.1 on our main dns server running solaris 2.7
> 
> I can see no problems with the operation of the in.named process - it
> seems to answer all the queries correctly, however, the memory image
> size of this process continues to grow to the point where it dumps core
> and stops.
> 
> At present I have worked around the problem, by using an hourly cron job
> that kills and restarts named if its memory size is > 20 MBytes -
> however this isn't the nicest of solutions.
> 
> Has anyone else noticed similar problems?  Does anyone have 8.2.1
> working successfully on solaris 2.7?  Would I be best advised to:
>  - try an earlier version?
>  - try a re-compile with some different options?
>  - wait for an upcoming upgrade?
> 
> The server is getting pretty high usage as the main dns server for our
> university.  It seems to take a little more than a day to achieve
> 20MByte memory image size.
> 
> I also have bind_8.2.1 running on 2 backup name servers also running
> solaris 2.7- their process sizes also grow without bound, but more
> slowly, since they receive fewer requests.
> 
> Here are some file sizes on the main server to give you an idea of the
> number of records in our zones
> bash-2.03$ wc zone/*
>     2433    8071  120111 zone/1
>      175     635    6498 zone/csu.edu.au
>      126     347    5227 zone/dubbo.csu.edu.au
>       79     280    2323 zone/dubbo.csu.edu.au.preatm
>      124     362    5764 zone/goulb.csu.edu.au
>      126     515    3715 zone/man.csu.edu.au
>     2461    8162  121459 zone/mit.csu.edu.au
>       23      77    1000 zone/netcommplete.com.au
>     2475    8181  121668 zone/new.mit.csu.edu.au
>     1378    5200   51579 zone/resba.csu.edu.au
>      109     380    3849 zone/syd.csu.edu.au
>       19      56     462 zone/tasrural.com.au
>     9528   32266  443655 total
> bash-2.03$ wc rev/*
>      148     500    6317 rev/1
>       24      57     838 rev/11.166.137.in-addr.arpa
>       20      61     755 rev/12.166.137.in-addr.arpa
>      200     704    8797 rev/16.166.137.in-addr.arpa
>       53     195    1552 rev/166.137.in-addr.arpa
>      162     572    7133 rev/17.166.137.in-addr.arpa
>      102     324    4527 rev/18.166.137.in-addr.arpa
>      124     402    5441 rev/19.166.137.in-addr.arpa
>      139     476    6146 rev/20.166.137.in-addr.arpa
>       46     109    2007 rev/200.166.137.in-addr.arpa
>       58     176    2519 rev/201.166.137.in-addr.arpa
>       41     113    1779 rev/202.166.137.in-addr.arpa
>       46     120    1996 rev/203.166.137.in-addr.arpa
>       32      82    1318 rev/204.166.137.in-addr.arpa
>       49     145    2136 rev/205.166.137.in-addr.arpa
>       37      79    1493 rev/206.166.137.in-addr.arpa
>       90     292    3879 rev/21.166.137.in-addr.arpa
>       25      72     993 rev/210.166.137.in-addr.arpa
>      125     427    5634 rev/22.166.137.in-addr.arpa
>       98     357    3866 rev/221.22.203.in-addr.arpa
>       78     233    3471 rev/23.166.137.in-addr.arpa
>       53     138    2335 rev/235.166.137.in-addr.arpa
>       69     193    3275 rev/236.166.137.in-addr.arpa
>       64     175    2824 rev/24.166.137.in-addr.arpa
>       11      33     286 rev/244.166.137.in-addr.arpa
>       38      82    1647 rev/248.166.137.in-addr.arpa
>       37     107    1532 rev/27.166.137.in-addr.arpa
>       74     228    3225 rev/28.166.137.in-addr.arpa
>       48     131    2051 rev/29.166.137.in-addr.arpa
>       38      88    1630 rev/30.166.137.in-addr.arpa
>       37     107    1532 rev/31.166.137.in-addr.arpa
>      152     510    6732 rev/32.166.137.in-addr.arpa
>      106     358    4595 rev/33.166.137.in-addr.arpa
>      117     391    5139 rev/34.166.137.in-addr.arpa
>      148     500    6317 rev/35.166.137.in-addr.arpa
>      191     683    8566 rev/36.166.137.in-addr.arpa
>       53     177    2388 rev/37.166.137.in-addr.arpa
>       65     190    2917 rev/38.166.137.in-addr.arpa
>       93     318    4135 rev/39.166.137.in-addr.arpa
>      228     862   10131 rev/40.166.137.in-addr.arpa
>      230     879   10286 rev/41.166.137.in-addr.arpa
>      224     849    9364 rev/44.166.137.in-addr.arpa
>      226     862    9452 rev/45.166.137.in-addr.arpa
>      229     865   10555 rev/48.166.137.in-addr.arpa
>      231     882   10726 rev/49.166.137.in-addr.arpa
>       42     121    1757 rev/52.166.137.in-addr.arpa
>       82     258    3612 rev/56.166.137.in-addr.arpa
>       42      97    1783 rev/60.166.137.in-addr.arpa
>       58     205    1930 rev/72.166.137.in-addr.arpa
>       60     193    2563 rev/8.166.137.in-addr.arpa
>        0       3     512 rev/old
>     4743   15981  206394 total
> 
> The hourly results of my cron job (ps -fael | egrep 'named|SZ')  show...
> 
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 23732 23731  0  71 20 f61ef328    227 f62535b8 05:05:01
> ?        0:00 egrep named|SZ
>  8 S     root 13352     1  1  41 20 f66d5758  11243 f65e2bc2 12:03:10
> ?       27:38 /usr/sbin/in.named
>  8 S     root 23731   166  1  71 20 f61ede28    241 f61ede94 05:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 23696 13352  0  61 20 f6135320    441 f6791814 05:03:04
> ?        0:00 /usr/sbin/named-xfer -z 220.22.203.
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 24500   166  1  71 20 f673c570    241 f673c5dc 06:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 13352     1  2  41 20 f66d5758  11419 f65e2bc2 12:03:10
> ?       28:10 /usr/sbin/in.named
>  8 S     root 24501 24500  0  47 20 f66d5e58    229 f62535b8 06:05:00
> ?        0:00 egrep named|SZ
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 13352     1  1  41 20 f66d5758  11571 f65e2bc2 12:03:10
> ?       28:39 /usr/sbin/in.named
>  8 R     root 25128 25127  0  81 20 f61ede28    227          07:05:00
> ?        0:00 egrep named|SZ
>  8 S     root 25127   166  1  81 20 f61ed728    241 f61ed794 07:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 13352     1  2  41 20 f66d5758  11799 f65e2bc2 12:03:10
> ?       29:17 /usr/sbin/in.named
>  8 S     root 26034 26033  0  61 20 f673b070    227 f62535b8 08:05:00
> ?        0:00 egrep named|SZ
>  8 S     root 26033   166  1  61 20 f673b770    241 f673b7dc 08:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 13352     1  5  53 20 f66d5758  12279 f65e2bc2 12:03:10
> ?       30:45 /usr/sbin/in.named
>  8 S     root 28478 28475  0  47 20 f61ede28    227 f6253bf8 09:05:00
> ?        0:00 egrep named|SZ
>  8 S     root 28475   166  1  47 20 f66d8858    241 f66d88c4 09:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
> 
> The output of this hourly cron job showed the following leading up to
> and following my last core dump:
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 14573 14572  0  80 20 f632da30    227 f623d978 21:05:01
> ?        0:00 egrep named|SZ
>  8 S     root  2779     1  1  41 20 f632e130  36779 f6347c02   Jul 29
> ?       102:37 /usr/sbin/in.named
>  8 S     root 14572   182  1  80 20 f674a080    241 f674a0ec 21:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 15651 15649  0  71 20 f67fa3a8    227 f623d978 22:05:00
> ?        0:00 egrep named|SZ
>  8 S     root 15649   182  1  71 20 f67f9ca8    241 f67f9d14 22:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root  2779     1  2  45 20 f632e130  36979 f6347c02   Jul 29
> ?       103:17 /usr/sbin/in.named
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 17338   182  1  71 20 f5ed4c28    241 f5ed4c94 23:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 17339 17338  0  71 20 f68121b8    227 f623d978 23:05:00
> ?        0:00 egrep named|SZ
>  8 S     root  2779     1  1  41 20 f632e130  37347 f6347c02   Jul 29
> ?       104:10 /usr/sbin/in.named
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 17993  2779  0  69 20 f598e308    441 f651fe4c 00:03:46
> ?        0:00 /usr/sbin/named-xfer -z 87.166.137.
>  8 S     root 17994   182  1  61 20 f68113b8    241 f6811424 00:05:01
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 17995 17994  0  61 20 f680ccb0    227 f623d978 00:05:01
> ?        0:00 egrep named|SZ
>  8 R     root  2779     1 98  99 20 f632e130  44675            Jul 29
> ?       111:33 /usr/sbin/in.named
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 18129   182  1  47 20 f68113b8    241 f6811424 01:05:02
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 18130 18129  0  47 20 f5ed4528    227 f623d978 01:05:03
> ?        0:00 egrep named|SZ
>  8 R     root  2779     1 96  69 20 f632e130  64103            Jul 29
> ?       129:22 /usr/sbin/in.named
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 18226   182  1  61 20 f5ed3728    241 f5ed3794 02:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
>  8 S     root 18227 18226  0  61 20 f5ed5a28    227 f623d978 02:05:00
> ?        0:00 egrep named|SZ
>  F S      UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN    STIME
> TTY      TIME CMD
>  8 S     root 18432 18431  0  61 20 f632e130    227 f623d978 03:05:00
> ?        0:00 egrep named|SZ
>  8 S     root 18431   182  1  61 20 f5ed3028    241 f5ed3094 03:05:00
> ?        0:00 sh -c ps -fael | egrep 'named|SZ' >
> Yes, those last few hours with no named process running were a real
> headache!!
> 
> I would appreciate any advice or similar experiences that anyone can
> pass on to me.
> 
> Many Thanks
> 
> Tim
> 
> 
> --
> ==============================================================================
> Please note my new phone number......
> Tim Rayner - Networks Officer         | Email : trayner at csu.edu.au
>              Murray Campus            |  Mail : P.O. Box 789, Albury,NSW, 2640
>              Charles Sturt University | Phone : (02) 6051 9886
>                                       |   Fax : (02) 6051 9919
> ==============================================================================
--
Mark Andrews, Internet Software Consortium
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka at isc.org

