AOLserver Tuning, Troubleshooting and Scaling
Adjusting AOLserver for current conditions is very straightforward, and tuning the following parameters will help. I got these out of my AOLserver Tuning and Scaling Guide (http://aolserver.com/docs/tuning.adp) and the Configuration Reference (http://aolserver.com/docs/config.adp).
For mysterious crashes with no explanation, the cause is probably data corrupted by blown stacks. A multi-threaded application cannot detect a stack overflow and cannot grow a thread's stack once the thread exists, so an overflow silently corrupts your data. You can predict the stack you need by adding up the sizes of all your locally declared data (do not forget to include the function parameters!) along the deepest call chain, and then adjusting the stacksize parameter to match. The stacksize can be anything; AOLserver sets it to 65535 bytes on most systems (the Solaris 8 default of 1 megabyte can be too large). For large apps, especially apps using deep calls to Tcl procs, this default stacksize is often not enough. Remember, too, that the stack is allocated per *thread*: at 1 megabyte each, every thread you run costs another megabyte. The kernel cannot grow thread stacks or sense their out-of-bounds condition the way it can with processes, so investigate this parameter[1].
    ns_section "ns/threads"
    ns_param stacksize [expr 128*1024] ;# stack size per thread (in bytes)
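Since the stack is paid for once per thread, it helps to multiply it out before raising it. Here is a rough sketch in plain Tcl, not anything AOLserver ships: the proc name stack_budget_bytes is made up for illustration, and the pool size of 20 is simply an assumed thread count. The server also runs threads outside the connection pool, so treat the result as a lower bound.

```tcl
# Hypothetical helper (not an AOLserver command): worst-case stack
# reservation if every thread in the pool exists at the same time.
proc stack_budget_bytes {threads stacksize} {
    return [expr {$threads * $stacksize}]
}

set stacksize  [expr {128 * 1024}]   ;# the 128 KB value used in the example
set maxthreads 20                    ;# assumed number of connection threads

puts "per thread:   $stacksize bytes"
puts "pool total:   [stack_budget_bytes $maxthreads $stacksize] bytes"
puts "at 1 MB each: [stack_budget_bytes $maxthreads [expr {1024 * 1024}]] bytes"
```

With these assumed numbers the pool reserves about 2.5 MB at 128 KB per thread, against 20 MB at 1 MB per thread.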
For overloaded machines that work fine and then suddenly start thrashing, try adjusting AOLserver's use of system resources. Threads are very flexible and use your system's resources much better than forking processes, but every system has its limit! For example, Solaris' maximum number of processes is a function of system memory (up to 2 gigabytes), and its maximum number of threads is 32768[2].
    ns_section "ns/server/${servername}"
    ns_param maxconnections 100 ;# Max connections to put on queue
    ns_param maxdropped 0 ;# Shut down if dropping too many conns
    ns_param maxthreads 20 ;# Tune this to scale your server
    ns_param minthreads 0 ;# Tune this to scale your server
    ns_param threadtimeout 120 ;# Idle threads die at this rate
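One way to pick a starting point for maxthreads is a back-of-the-envelope estimate from Little's law (average busy threads = arrival rate times average service time). This is only a sketch in plain Tcl: the proc name threads_needed is made up, and the request rate and service time are assumed figures, not measurements from any real server.

```tcl
# Hypothetical helper (not an AOLserver command): average number of
# busy connection threads, by Little's law.
proc threads_needed {requests_per_sec avg_service_sec} {
    return [expr {int(ceil($requests_per_sec * $avg_service_sec))}]
}

# Assumed workload figures; measure your own before changing maxthreads.
set rate    50.0   ;# requests per second at peak
set service 0.25   ;# average seconds spent serving one request

puts "busy threads at peak: [threads_needed $rate $service]"
```

If the estimate comes out above your maxthreads setting, requests will be waiting in the queue at peak; if it is far below, the extra threads are mostly idle stack space.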
For heavy use of nsv commands, try adjusting the buckets:
    ns_section "ns/server/${servername}/tcl"
    ns_param nsvbuckets 8 ;# No. of buckets to hold nsv's
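To see why the bucket count can matter, here is a toy simulation in plain Tcl. It assumes, as I understand the design, that each nsv array is assigned to a bucket by its name and that arrays sharing a bucket also share a lock. The hash below is a stand-in written for this sketch and is NOT copied from the AOLserver source, and the array names are invented.

```tcl
# Toy string hash for illustration only; the server's real hash may differ.
proc toy_bucket {name nbuckets} {
    set h 0
    foreach ch [split $name ""] {
        scan $ch %c code
        set h [expr {($h * 9 + $code) & 0xffffffff}]
    }
    return [expr {$h % $nbuckets}]
}

# Hypothetical nsv array names an application might use.
set arrays {sessions counters config users cache_keys locks stats queue}

foreach nbuckets {8 32} {
    array unset load
    foreach name $arrays {
        set b [toy_bucket $name $nbuckets]
        if {![info exists load($b)]} { set load($b) 0 }
        incr load($b)
    }
    puts "$nbuckets buckets: [array size load] in use for [llength $arrays] arrays"
}
```

The fewer distinct buckets in use, the more arrays are contending for the same lock under this assumption, which is the situation a larger nsvbuckets value is meant to relieve.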
For cache adjustments (ADPs only), try these:
    ns_section "ns/server/${servername}/adp"
    ns_param cache true ;# In-memory cache of ADP's
    ns_param cachesize [expr 5000*1024]
For cache adjustments (HTML, txt, images), try these:
    ns_section "ns/server/${servername}/fastpath"
    ns_param cache true ;# Enable cache for normal URLs
    ns_param cachemaxentry 8192 ;# Largest file size allowable in cache
    ns_param cachemaxsize [expr 5000*1024] ;# Size of fastpath cache
    ns_param mmap false ;# Use mmap() for cache
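The two fastpath sizes are easier to reason about together. As a quick sanity check in plain Tcl, using the values from the example (the proc name min_full_size_entries is made up for this sketch), you can work out how many files at the per-entry cap would fit in the cache:

```tcl
# Hypothetical helper (not an AOLserver command): how many files of the
# largest cacheable size fit in the cache at once.
proc min_full_size_entries {maxsize maxentry} {
    return [expr {$maxsize / $maxentry}]
}

set cachemaxsize  [expr {5000 * 1024}]  ;# total cache size from the example
set cachemaxentry 8192                  ;# largest cacheable file from the example

puts "cache size: $cachemaxsize bytes"
puts "files at the size cap that fit: [min_full_size_entries $cachemaxsize $cachemaxentry]"
```

With these values at least 625 files fit, and more if most of them are smaller than the cap. If your commonly served files are bigger than cachemaxentry, they will not be cached at all, so check that figure against your actual content first.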
Watch your system! Use pstack(1), swap(1M), vmstat(1) (see note 1), and turn on the sar(1) facility (/var/spool/cron/crontabs/sys). At a minimum, your swap usage should be zero. Make ps(1) more useful by aliasing it to this:
    ps \
        -o user="USER" \
        -o pid,ppid,pgid \
        -o pri=" PR" \
        -o nice="NI" \
        -o s="S" \
        -o time="CPU_TIME" \
        -o pcpu="%CPU" \
        -o rss=" CORE_MEM" \
        -o vsz="TOTAL_MEM" \
        -o pmem="%MEM" \
        -o etime \
        -o comm \
        $1 $2 $3 $4 $5 $6 $7 $8 $9
Regards,
Kris
Footnotes:
- [1] Vahalia, "Unix Internals", p. 61.
- [2] Mauro, McDougall, "Solaris Internals", p. 289.
Notes:
Note 1: Beware that running vmstat on a busy system won't show anything that can help you. It eats a lot of CPU and causes a lot of context switches, because it reads data from /proc for EVERY thread/process running on the machine. I would recommend using procinfo instead, and vmstat only when you have no more than, say, 200-300 processes running on the machine. (This is real-world experience :) )