Difference between revisions of "AOLserver Tuning, Troubleshooting and Scaling"

From AOLserver Wiki
(imported from WiKit id 67)
 
Revision as of 08:51, 30 January 2009

Adjusting AOLserver for current conditions is straightforward: tune the parameters described below. I got these out of my AOLserver Tuning and Scaling Guide: http://aolserver.com/docs/tuning.adp and the Configuration Reference: http://aolserver.com/docs/config.adp.

Mysterious crashes with no explanation are probably due to data corrupted by blown stacks. A multi-threaded application cannot detect a stack overflow and cannot grow a thread's stack at run time, so an overflow can corrupt your data. You can estimate how much stack you need by adding up the sizes of all your locally declared data (do not forget to include the function parameters!) and adjusting the stacksize parameter to match. The stacksize can be anything; AOLserver sets it to 65535 bytes on most systems (the Solaris 8 limit of 1 megabyte can be too large). For large apps, especially apps using deep calls to Tcl procs, this default stacksize is usually enough. Remember, too, that each thread can have a stack of 1 megabyte, and that is per *thread*. The kernel cannot grow a thread's stack or sense its out-of-bounds condition as it can for a process, so investigate this parameter[1].

 ns_section "ns/threads"
 ns_param   stacksize       2048000  ;# stack size per thread (in bytes)
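
The stack arithmetic above can be sketched in a few lines of shell. This is only an illustration: the helper names and the 2048-byte average frame size are assumptions made for the example, not anything AOLserver provides, and the figures it prints are rough upper bounds rather than measurements.

```shell
#!/bin/sh
# Sketch only: helper names and the 2048-byte frame size are illustrative.

# Approximate number of nested calls that fit in one thread stack,
# given an estimated average frame size (locals plus parameters).
max_call_depth() {
    echo $(( $1 / $2 ))     # $1 = stacksize in bytes, $2 = bytes per frame
}

# Upper bound on stack space set aside if every thread exists at once.
stack_budget_bytes() {
    echo $(( $1 * $2 ))     # $1 = stacksize in bytes, $2 = thread count
}

max_call_depth 65535 2048        # prints 31
max_call_depth 2048000 2048      # prints 1000
stack_budget_bytes 2048000 20    # prints 40960000
```

Measure your own frame sizes before trusting the depth figure. The budget figure is simply stacksize multiplied by the number of threads you allow.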

For overloaded machines that work fine and then suddenly start thrashing, try adjusting AOLserver's use of system resources. Threads are very flexible and use your system's resources much better than forking processes, but every system has its limit! For example, Solaris's maximum number of processes is a function of system memory (up to 2 gigabytes) and its maximum number of threads is 32768[2].

 ns_section "ns/server/${servername}"
 ns_param   maxconnections  100       ;# Max connections to put on queue
 ns_param   maxdropped      0         ;# Shut down if dropping too many conns
 ns_param   maxthreads      20        ;# Tune this to scale your server
 ns_param   minthreads      0         ;# Tune this to scale your server
 ns_param   threadtimeout   120       ;# Idle threads exit after this many seconds

For heavy use of nsv commands, try adjusting the buckets:

 ns_section "ns/server/${servername}/tcl"
 ns_param   nsvbuckets      8         ;# No. of buckets to hold nsv's

For cache adjustments (ADPs only), try these:

 ns_section "ns/server/${servername}/adp"
 ns_param   cache           true      ;# In-memory cache of ADP's
 ns_param   cachesize       [expr 10000*1024]

For cache adjustments (HTML, txt, images), try these:

 ns_section "ns/server/${servername}/fastpath"
 ns_param   cache           true      ;# Enable cache for normal URLs
 ns_param   cachemaxentry   8192      ;# Largest file size allowable in cache
 ns_param   cachemaxsize    [expr 10000*1024] ;# Size of fastpath cache
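
To get a feel for the fastpath numbers above, the following shell sketch works out the cache size in bytes and how many files of the largest cacheable size would fit. The helper name is made up for this example, and the calculation ignores any per-entry overhead the server may add.

```shell
#!/bin/sh
# Sketch only: the helper name is illustrative, not an AOLserver command.

# How many files of the largest cacheable size fit in the cache.
# $1 = cache size in bytes, $2 = largest cacheable entry in bytes
min_cached_entries() {
    echo $(( $1 / $2 ))
}

cachemaxsize=$(( 10000 * 1024 ))          # same value as [expr 10000*1024]
echo "$cachemaxsize"                      # prints 10240000
min_cached_entries "$cachemaxsize" 8192   # prints 1250
```

Smaller files take less room, so the real number of cached files will usually be higher than this floor.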
 ns_param   mmap            false     ;# Use mmap() for cache

Watch your system! Use pstack(1), swap(1M), and vmstat(1) (see note 1), and turn on the sar(1) facility (/var/spool/cron/crontabs/sys). At a minimum, make sure your swap usage stays at zero. Make ps(1) more useful by aliasing it to this:

 ps \
       -o user="USER" \
       -o pid,ppid,pgid \
       -o pri=" PR"  \
       -o nice="NI" \
       -o s="S" \
       -o time="CPU_TIME" \
       -o pcpu="%CPU" \
       -o rss=" CORE_MEM" \
       -o vsz="TOTAL_MEM" \
       -o pmem="%MEM" \
       -o etime \
       -o comm \
       $1 $2 $3 $4 $5 $6 $7 $8 $9
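
One possible way to install the command above is as a shell function, which passes any number of arguments through with "$@". The name psx is made up for this sketch, and it assumes a Bourne-style shell. Column support differs between ps implementations, so some -o fields may need adjusting on your platform.

```shell
#!/bin/sh
# Sketch only: "psx" is a made-up name. Put it in your shell startup file.
psx() {
    ps \
        -o user="USER" \
        -o pid,ppid,pgid \
        -o pri=" PR" \
        -o nice="NI" \
        -o s="S" \
        -o time="CPU_TIME" \
        -o pcpu="%CPU" \
        -o rss=" CORE_MEM" \
        -o vsz="TOTAL_MEM" \
        -o pmem="%MEM" \
        -o etime \
        -o comm \
        "$@"    # forward every argument, however many there are
}

# Examples (not run here):
#   psx -e         every process
#   psx -p 1234    one process, by PID
```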


Regards,

Kris

Footnotes:

  • [1] Vahalia, "Unix Internals", p. 61.
  • [2] Mauro, McDougall, "Solaris Internals", p. 289.


Notes:

  !note 1!
    Beware that running vmstat on a busy system won't show anything that can help you: it eats a lot of CPU and causes a lot of context switches, because it reads data from /proc for _EVERY_ thread/process running on the machine. I would recommend using procinfo instead, and vmstat only when you have no more than, say, 200-300 processes running on the machine. (This is real-world experience :) )
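
The rule of thumb in this note could be wrapped in a small shell check. The function name and the default limit of 300 are assumptions for this sketch, taken from the rough 200-300 figure above; the live example counts processes only, not threads.

```shell
#!/bin/sh
# Sketch only: function name and default limit are illustrative.
# $1 = number of running processes, $2 = optional limit (default 300)
vmstat_advisable() {
    if [ "$1" -le "${2:-300}" ]; then
        echo yes
    else
        echo no
    fi
}

vmstat_advisable 150     # prints yes
vmstat_advisable 1200    # prints no

# On a live host (not run here; the count includes the ps header line):
#   vmstat_advisable $(ps -e | wc -l)
```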