AOLserver Tuning, Troubleshooting and Scaling
Latest revision as of 08:55, 30 January 2009
Tuning AOLserver for current conditions is straightforward: adjust the parameters described here. I got these out of my AOLserver Tuning and Scaling Guide: http://aolserver.com/docs/tuning.adp and the Configuration Reference: http://aolserver.com/docs/config.adp.
Mysterious crashes with no explanation are probably due to data corrupted by blown stacks. Multi-threaded applications cannot detect stack overflows and cannot grow their stacks, so an overflow can silently corrupt your data. You can predict the stack space you need by adding up the sizes of all your locally declared variables (do not forget to include the function parameters!) and then adjusting the stacksize parameter accordingly. The stacksize can be anything; AOLserver sets it to 65535 bytes on most systems. For plain AOLserver this default is usually enough, even for large apps and apps that make deep calls to Tcl procs. If the stack size is too small, the server crashes. On 64-bit systems running, for example, OpenACS, you have to increase the stack size as done below. Remember, too, that each thread can have a stack of 1 megabyte, and that is per *thread*. The kernel cannot grow thread stacks or sense their out-of-bounds condition the way it can for processes, so investigate this parameter[1].
ns_section "ns/threads"
    ns_param stacksize 2048000 ;# stack size per thread (in bytes)
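Because the stack size applies to every thread, it is worth multiplying it out before raising it. The following is a minimal sketch of that arithmetic, not an AOLserver tool: the variable names are made up for illustration, the thread count of 20 is an assumed value, and the estimate counts connection threads only, so the real total may be higher.

```shell
#!/bin/sh
# Illustrative values only: copy the real ones from your own config file.
stacksize=2048000   # bytes per thread (stacksize in ns/threads)
maxthreads=20       # assumed number of connection threads

# Worst case: every connection thread uses its whole stack.
total_bytes=$((stacksize * maxthreads))
total_mib=$((total_bytes / 1024 / 1024))   # integer division, rounds down

echo "worst-case stack memory: ${total_bytes} bytes (about ${total_mib} MiB)"
```

With these numbers the stacks alone can claim roughly 39 MiB, which is worth comparing against the machine's free memory before raising either value.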
For machines that work fine and then suddenly start thrashing under load, try adjusting AOLserver's use of system resources. Threads are very flexible and use your system's resources much better than forking processes, but every system has its limit! For example, Solaris' maximum number of processes is a function of system memory (up to 2 gigabytes), and its maximum number of threads is 32768[2].
ns_section "ns/server/${servername}"
    ns_param maxconnections 100 ;# Max connections to put on queue
    ns_param maxdropped 0 ;# Shut down if dropping too many conns
    ns_param maxthreads 20 ;# Tune this to scale your server
    ns_param minthreads 0 ;# Tune this to scale your server
    ns_param threadtimeout 120 ;# Idle threads die at this rate
For heavy use of nsv commands, try adjusting the buckets:
ns_section "ns/server/${servername}/tcl"
    ns_param nsvbuckets 8 ;# No. of buckets to hold nsv's
For cache adjustments (ADPs only), try these:
ns_section "ns/server/${servername}/adp"
    ns_param cache true ;# In-memory cache of ADP's
    ns_param cachesize [expr 10000*1024]
For cache adjustments (HTML, txt, images), try these:
ns_section "ns/server/${servername}/fastpath"
    ns_param cache true ;# Enable cache for normal URLs
    ns_param cachemaxentry 8192 ;# Largest file size allowable in cache
    ns_param cachemaxsize [expr 10000*1024] ;# Size of fastpath cache
    ns_param mmap false ;# Use mmap() for cache
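To get a feel for what the fastpath numbers above mean together, the following sketch works out how many files of the largest cacheable size fit into the cache. It is plain arithmetic under the assumption that both values are in bytes, it ignores any per-entry overhead inside the server, and the variable names are only illustrative.

```shell
#!/bin/sh
# Values taken from the fastpath section above.
cachemaxsize=$((10000 * 1024))   # total cache size in bytes
cachemaxentry=8192               # largest cacheable file in bytes

# If every cached file were as large as allowed, this many would fit.
worst_case_entries=$((cachemaxsize / cachemaxentry))

echo "cache size: ${cachemaxsize} bytes"
echo "files of maximum size that fit: ${worst_case_entries}"
```

If your typical static files are larger than cachemaxentry, they are never cached at all, so that parameter probably deserves attention before the overall cache size does.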
Watch your system! Use pstack(1), swap(1M), and vmstat(1) (!note 1!), and turn on the sar(1) facility (/var/spool/cron/crontabs/sys). At a minimum, your swap usage should be zero. Make ps(1) more useful by aliasing it to this:
ps \
  -o user="USER" \
  -o pid,ppid,pgid \
  -o pri=" PR" \
  -o nice="NI" \
  -o s="S" \
  -o time="CPU_TIME" \
  -o pcpu="%CPU" \
  -o rss=" CORE_MEM" \
  -o vsz="TOTAL_MEM" \
  -o pmem="%MEM" \
  -o etime \
  -o comm \
  $1 $2 $3 $4 $5 $6 $7 $8 $9
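In sh, ksh and bash, an alias is plain text substitution and does not receive its own positional parameters, so the `$1 ... $9` above only does what is intended inside a script or a function. One way to package the command is a shell function, sketched here. The name `psx` is made up for this example, `"$@"` stands in for `$1 ... $9`, and the column list is unchanged from the command above; whether every `-o` keyword is accepted depends on your platform's ps.

```shell
# Hypothetical helper: put it in ~/.profile or a similar startup file.
psx() {
    ps \
        -o user="USER" \
        -o pid,ppid,pgid \
        -o pri=" PR" \
        -o nice="NI" \
        -o s="S" \
        -o time="CPU_TIME" \
        -o pcpu="%CPU" \
        -o rss=" CORE_MEM" \
        -o vsz="TOTAL_MEM" \
        -o pmem="%MEM" \
        -o etime \
        -o comm \
        "$@"   # pass through any extra ps options, e.g. -e or -p PID
}
```

For example, `psx -e` lists every process in this format and `psx -p 1234` shows a single process.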
Regards,
Kris
Footnotes:
- [1] Vahalia, "Unix Internals", p. 61.
- [2] Mauro, McDougall, "Solaris Internals", p. 289.
Notes:
!note 1! Beware that running vmstat on a busy system won't show anything that can help you: it eats a lot of CPU and causes a lot of context switches, because it reads data from /proc for _EVERY_ thread/process running on the machine. I would recommend using procinfo instead, and using vmstat only when you have no more than, say, 200-300 processes running on the machine. (This is real-world experience :) )