AOLserver Tuning, Troubleshooting and Scaling

From AOLserver Wiki
Revision as of 03:55, 30 January 2009 by Gustafn (Talk | contribs)


Adjusting AOLserver for current conditions is straightforward; tuning the following parameters will usually help. I got these out of my AOLserver Tuning and Scaling Guide and the Configuration Reference.

For mysterious crashes with no explanation, the cause is probably data corrupted by a blown stack. A multi-threaded application cannot detect a stack overflow or grow its stacks on demand, so an overflow silently corrupts your data. You can predict your needs by adding up the sizes of all your locally-declared data (do not forget to include the function parameters!!) and adjusting the stacksize parameter accordingly. The stacksize can be anything; AOLserver sets it to 65535 bytes on most systems. For large apps, especially apps making deep calls into Tcl procs, this default is usually enough for plain AOLserver, but if the stack size is too small, the server crashes. For 64-bit systems running e.g. OpenACS, one has to increase the stack size as done below. Remember, too, that each thread can then have a stack of 1 megabyte--that's per *thread*. The kernel cannot grow these stacks or sense their out-of-bounds condition like it can do with processes, so investigate this parameter[1].

 ns_section "ns/threads"
 ns_param   stacksize       2048000  ;# stack size per thread (in bytes)
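To confirm what value the running server actually picked up, you can read the parameter back with ns_config; a small sketch (the section name matches the one above):

```tcl
# Sketch: log the thread stack size the server picked up at startup.
# ns_config returns the configured value, or the given default if unset.
set stacksize [ns_config "ns/threads" stacksize 65535]
ns_log Notice "configured thread stacksize: $stacksize bytes"
```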

For overloaded machines that work fine and then suddenly start thrashing, try adjusting AOLserver's use of system resources. Threads are very flexible and use your system's resources much better than forked processes, but every system has its limits!! For example, Solaris' maximum number of processes is a function of system memory (up to 2 gigabytes), and its maximum number of threads is 32768[2].

 ns_section "ns/server/${servername}"
 ns_param   maxconnections  100       ;# Max connections to put on queue
 ns_param   maxdropped      0         ;# Shut down if dropping too many conns
 ns_param   maxthreads      20        ;# Tune this to scale your server
 ns_param   minthreads      0         ;# Tune this to scale your server
 ns_param   threadtimeout   120       ;# Idle threads die at this rate
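At runtime you can watch how the connection pool is doing; in AOLserver 4.x the ns_server command exposes queue and thread statistics (exact subcommands vary by version, so treat this as a sketch):

```tcl
# Sketch (AOLserver 4.x): report connection-queue pressure so you can
# judge whether maxthreads/maxconnections need raising.
ns_log Notice "threads: [ns_server threads]"   ;# min/max/current thread counts
ns_log Notice "waiting: [ns_server waiting]"   ;# conns queued for a free thread
```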

For heavy use of nsv commands, try adjusting the buckets:

 ns_section "ns/server/${servername}/tcl"
 ns_param   nsvbuckets      8         ;# No. of buckets to hold nsv's
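The buckets matter because every nsv array is hashed into one of them, and each bucket is guarded by its own mutex; under heavy nsv traffic, more buckets mean less lock contention. A minimal illustration of the commands involved:

```tcl
# Each nsv array lands in one of nsvbuckets buckets; concurrent access
# to arrays in different buckets does not contend for the same mutex.
nsv_set counters hits 0                   ;# create/overwrite a key
nsv_incr counters hits                    ;# atomic increment
ns_log Notice "hits so far: [nsv_get counters hits]"
```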

For cache adjustments (ADPs only), try these:

 ns_section "ns/server/${servername}/adp"
 ns_param   cache           true      ;# In-memory cache of ADP's
 ns_param   cachesize       [expr 10000*1024]

For cache adjustments (HTML, txt, images), try these:

 ns_section "ns/server/${servername}/fastpath"
 ns_param   cache           true      ;# Enable cache for normal URLs
 ns_param   cachemaxentry   8192      ;# Largest file size allowable in cache
 ns_param   cachemaxsize    [expr 10000*1024] ;# Size of fastpath cache
 ns_param   mmap            false     ;# Use mmap() for cache

Watch your system! Use pstack(1), swap(1M), vmstat(1) (but see !note 1!), and turn on the sar(1) facility (/var/spool/cron/crontabs/sys). At minimum, your swap usage should be zero. Make ps(1) more useful by wrapping it in a shell script like this (a plain alias won't do, since the arguments have to be passed through):

 ps \
       -o user="USER" \
       -o pid,ppid,pgid \
       -o pri=" PR"  \
       -o nice="NI" \
       -o s="S" \
       -o time="CPU_TIME" \
       -o pcpu="%CPU" \
       -o rss=" CORE_MEM" \
       -o vsz="TOTAL_MEM" \
       -o pmem="%MEM" \
       -o etime \
       -o comm \
       "$@"




  • [1] Vahalia, "Unix Internals", p. 61.
  • [2] Mauro and McDougall, "Solaris Internals", p. 289.


  !note 1!
    Beware that running vmstat on a busy system may not show you anything
    useful: it eats a lot of CPU and causes a lot of context switches,
    because it reads data from /proc for _EVERY_ thread/process running on
    the machine. I would recommend using procinfo instead, and vmstat only
    when you have no more than, say, 200-300 processes running on the
    machine. (This is real-world experience. :))