
Tuning Stingray Traffic Manager for best performance

Posted on 02-21-2013 10:35 AM

This document describes performance-related tuning you may wish to apply to a production Stingray Traffic Manager software installation, virtual appliance or cloud instance.  For related documents (e.g. operating system tuning), start with the Tuning Stingray Traffic Manager article.

Tuning Stingray Traffic Manager

Stingray will auto-size the majority of internal tables based on available memory, CPU cores and operating system configuration.  The default behavior is appropriate for typical deployments and it is rarely necessary to tune it.

Several changes can be made to the default configuration to improve peak capacity if necessary. Collectively, they may give a 5-20% capacity increase, depending on the specific test.

Basic performance tuning

Global settings

Global settings are defined in the ‘System’ part of the configuration.

  • Recent Connections table: Set recent_conns to 0 to prevent Stingray from archiving recent connection data for debugging purposes.
  • Verbose logging: Set flipper!verbose, webcache!verbose and gslb!verbose to 'no' to turn off verbose logging.

Virtual Server settings

Most Virtual Server settings relating to performance tuning are to be found in the Connection Management section of the configuration.

  • X-Cluster-Client-IP: For HTTP traffic, Stingray Traffic Manager adds an 'X-Cluster-Client-IP' header containing the remote client's IP address by default.  You should disable this feature if your back-end applications do not inspect this header.
  • HTTP Keepalives: enable support for Keepalives; this will reduce the rate at which TCP connections must be established and torn down.  Not only do TCP handshakes incur latency and additional network traffic, but closed TCP connections consume operating system resources until TCP timeouts are hit.
  • UDP Port SMP: set this to 'yes' if you are managing simple UDP protocols such as DNS.  Otherwise, all UDP traffic is handled by a single Stingray process (so that connections can be effectively tracked).

Pool settings

  • HTTP Keepalives: enable support for Keepalives (Pool: Connection Management; see Virtual Server note above).  This will reduce the load on your back-end servers and the Stingray system.
  • Session Persistence: Session Persistence overrides load balancing and can prevent the traffic manager from selecting the optimal node and applying optimizations such as LARD.  Use session persistence selectively and only apply it to requests that must be pinned to a node.

Advanced Performance Tuning

General Global Settings

Maximum File Descriptors (maxfds): File Descriptors are the basic operating system resource that Stingray consumes.  Typically, Stingray will require two file descriptors per active connection (client and server side) and one file descriptor for each idle keepalive connection and for each client connection that is pending or completing.

Stingray will attempt to bypass any soft per-process limits (e.g. those defined by ulimit) and gain the maximum number of file descriptors (per child process).  Doing this has no performance impact and minimal memory impact.  You can tune the maximum number of file descriptors in the OS using the fs.file-max kernel setting.

The default value of 1048576 should be sufficient.  Stingray will warn if it is running out of file descriptors, and will proactively close idle keepalives and slow down the rate at which new connections are accepted.
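The descriptor accounting described above can be turned into a quick capacity check. The following is a minimal sketch, not part of Stingray: the function name and the workload figures are hypothetical, and the only inputs taken from the text are the per-connection costs and the default of 1048576.

```python
def estimate_file_descriptors(active, idle_keepalives, pending_clients):
    """Estimate descriptors needed by one child process.

    Rule of thumb from the text: two descriptors per active connection
    (client and server side), one per idle keepalive connection, and one
    per client connection that is pending or completing.
    """
    return 2 * active + idle_keepalives + pending_clients


DEFAULT_LIMIT = 1048576  # default value quoted in the text

# Hypothetical workload figures, for illustration only.
needed = estimate_file_descriptors(
    active=200_000, idle_keepalives=50_000, pending_clients=10_000
)
print(f"estimated descriptors: {needed}")
print(f"fits within default limit: {needed <= DEFAULT_LIMIT}")
```

If the estimate approaches the limit, that is the point at which the warnings and proactive keepalive closing described above would be expected to appear.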

Listen queue size (listen_queue_size): this should be left at the default system value, and tuned using the operating system's somaxconn setting (see the related operating system tuning documents).

Number of child processes (num_children): this is auto-sized to the number of cores in the host system.  You can force the number of child processes to a particular number (for example, when running Stingray on a shared server) using the tunable ‘num_children’ which should be added manually to the global.cfg configuration file.
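When Stingray shares a host with other software, one simple way to choose a value is to leave some cores free for the other workloads. The sketch below is only an illustration of that arithmetic; the function name and the idea of "reserved cores" are assumptions, not Stingray features, and the result still has to be entered as num_children by hand.

```python
import os


def suggested_num_children(reserved_cores=0, total_cores=None):
    """Suggest a num_children value for a shared host.

    reserved_cores is a hypothetical count of cores to leave for other
    software. At least one child process is always suggested.
    """
    if total_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores - reserved_cores)


# Example: an 8-core host where 2 cores are kept for another service.
print(suggested_num_children(reserved_cores=2, total_cores=8))  # prints 6
```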

Tuning Accept behavior

The default accept behavior is tuned so that child processes greedily accept connections as quickly as possible.  With very large numbers of child processes, if you see uneven CPU usage, you may need to tune the multiple_accept, max_accepting and accepting_delay values in the Global Settings to limit the rate at which child processes take work.

Tuning network read/write behavior

The Global Settings values so_rbuff_size and so_wbuff_size are used to tune the size of the operating system (kernel-space) read and write buffers, as restricted by the operating system limits /proc/sys/net/core/rmem_max and /proc/sys/net/core/wmem_max.

These buffer sizes determine how much network data the kernel will buffer before refusing additional data (from the client in the case of the read buffer, and from the application in the case of the write buffer).  If these values are increased, kernel memory usage per socket will increase.
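To see how the operating system limits interact with a requested buffer size, you can ask the kernel for a size and read back what it actually granted. This sketch uses the standard socket options SO_RCVBUF and SO_SNDBUF; it illustrates the operating system mechanism only and is not Stingray's own code. The requested size is an arbitrary example value.

```python
import socket


def effective_buffer_sizes(requested_bytes):
    """Request kernel read/write buffer sizes and return what was granted.

    On Linux the kernel caps the request at rmem_max / wmem_max and
    reports double the stored value to account for bookkeeping overhead,
    so the returned numbers may differ from the request.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        granted_read = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        granted_write = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return granted_read, granted_write


read_size, write_size = effective_buffer_sizes(262144)  # example: 256 KiB
print(f"read buffer: {read_size} bytes, write buffer: {write_size} bytes")
```

If the granted sizes are much smaller than the request, the operating system limits are the constraint and would need raising before larger values can take effect.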

In normal operation, Stingray will move data from the kernel buffers to its user-space buffers sufficiently quickly that the kernel buffers do not fill up.  You may want to increase these buffer sizes when running under high connection load on a fast network.

The Virtual Server settings max_client_buffer and max_server_buffer define the size of the Stingray (user-space) read and write buffers, used when Stingray is streaming data between the client and the server.  The buffers are temporary stores for the data read from the network buffers.  Larger values will increase memory usage per connection, to the benefit of more efficient flow control; this will improve performance for clients or servers accessing over high-latency networks.
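The memory cost of larger user-space buffers can be bounded with simple arithmetic. The sketch below assumes the worst case, in which every connection fills both buffers at once; real usage will normally be lower. The buffer sizes and connection count are hypothetical example values, not recommended settings.

```python
def worst_case_buffer_bytes(connections, max_client_buffer, max_server_buffer):
    """Upper bound on user-space buffer memory, in bytes.

    Assumes every connection has both its client and server buffers
    completely full at the same time.
    """
    return connections * (max_client_buffer + max_server_buffer)


# Hypothetical example: 10,000 connections with 64 KiB buffers each way.
total = worst_case_buffer_bytes(10_000, 65536, 65536)
print(f"{total / (1024 * 1024):.0f} MiB")  # prints 1250 MiB
```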

The value chunk_size controls how much data Stingray reads and writes from the network buffers when processing traffic, and internal application buffers are allocated in units of chunk_size.  To limit fragmentation and assist scalability, the default value is quite low (4096 bytes); if you have plenty of free memory, consider setting it to 8192 or 16384.  Doing so will increase Stingray’s memory footprint but may reduce the number of system calls, slightly reducing CPU usage (system time).
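The effect of chunk_size on system call counts can be estimated as follows. This is a best-case sketch that assumes every read returns a full chunk; in practice reads often return less data than requested, so real counts will be higher. The function name is hypothetical.

```python
def min_reads_needed(payload_bytes, chunk_size):
    """Minimum number of reads to move payload_bytes in chunk_size units."""
    # Ceiling division without floating point.
    return -(-payload_bytes // chunk_size)


payload = 1024 * 1024  # a 1 MiB response body
for chunk_size in (4096, 8192, 16384):
    print(chunk_size, min_reads_needed(payload, chunk_size))
# prints:
# 4096 256
# 8192 128
# 16384 64
```

Doubling chunk_size halves the best-case number of reads, which is where the reduction in system time comes from.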

You may wish to tune the buffer size parameters if you are handling very large file transfers or video downloads over congested networks, and the chunk_size parameter if you have large amounts of free memory that is not reserved for caching and other purposes.

Tuning SSL performance

In general, the fastest secure ciphers that Stingray supports are SSL_RSA_WITH_RC4_128_SHA and SSL_RSA_WITH_RC4_128_MD5.  These are enabled by default.

SSL uses a private/public key pair during the initial client handshake.  1024-bit keys are approximately 5 times faster than 2048-bit keys (due to the computational complexity of the key operation), and are sufficiently secure for applications that require a moderate degree of protection.

SSL sessions are cached locally, and shared between all traffic manager child processes using a fixed-size (allocated at start-up) cache.  On a busy site, you should check the size, age and miss-rate of the SSL Session ID cache (using the Activity monitor) and increase the size of the cache (ssl!cache!size) if there is a significant number of cache misses.
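When reviewing the session cache figures, the miss rate is the share of lookups that did not find a cached session. The sketch below shows that calculation; the function name, the sample counters and the 10% threshold are all hypothetical, since the text does not define what counts as a significant number of misses.

```python
def ssl_cache_miss_rate(hits, misses):
    """Return the fraction of session cache lookups that missed."""
    lookups = hits + misses
    return misses / lookups if lookups else 0.0


# Hypothetical counters read from the Activity monitor.
rate = ssl_cache_miss_rate(hits=900, misses=100)
HYPOTHETICAL_THRESHOLD = 0.10  # illustrative only, not a documented value

print(f"miss rate: {rate:.1%}")
if rate >= HYPOTHETICAL_THRESHOLD:
    print("consider increasing ssl!cache!size")
```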

Tuning from-Client connections

Timeouts are the key tool for controlling client-initiated connections to the traffic manager:

  • connect_timeout discards newly-established connections if no data is received within the timeout;
  • keepalive_timeout holds client-side keepalive connections open for a short time before discarding them if they are not reused;
  • timeout is a general-purpose timeout that discards an active connection if no data is received within the timeout period.
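The three timeouts above each apply to a different stage of a client connection. The following sketch models that relationship; it is a simplified illustration, not Stingray's implementation. The state names are hypothetical and the timeout values are example figures only.

```python
from enum import Enum


class ConnState(Enum):
    NEW = "new"                        # established, no data received yet
    IDLE_KEEPALIVE = "idle_keepalive"  # waiting to be reused
    ACTIVE = "active"                  # request or response in progress


# Which setting governs each (hypothetical) connection state.
TIMEOUT_FOR_STATE = {
    ConnState.NEW: "connect_timeout",
    ConnState.IDLE_KEEPALIVE: "keepalive_timeout",
    ConnState.ACTIVE: "timeout",
}


def should_discard(state, seconds_without_data, settings):
    """Return True if the connection has exceeded its applicable timeout."""
    return seconds_without_data >= settings[TIMEOUT_FOR_STATE[state]]


# Example values in seconds, for illustration only.
settings = {"connect_timeout": 10, "keepalive_timeout": 10, "timeout": 40}
print(should_discard(ConnState.NEW, 12, settings))     # prints True
print(should_discard(ConnState.ACTIVE, 12, settings))  # prints False
```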

If you suspect that connections are dropped prematurely due to timeouts, you can temporarily enable the Virtual Server setting log!client_connection_failures to record the details of dropped client connections.

Tuning to-Server connections

When processing HTTP traffic, Stingray uses a pool of Keep-Alive connections to reuse TCP connections and reduce the rate at which TCP connections must be established and torn down.  If you use a webserver with a fixed concurrency limit (for example, Apache with its MaxClients and ServerLimit settings), then you should tune the connection limits carefully to avoid overloading the webserver and creating TCP connections that it cannot service.

Pool: max_connections_pernode: This setting limits the total number of TCP connections that this pool will make to each node; keepalive connections are included in that count. Stingray will queue excess requests and schedule them to the next available server.  The current count of established connections to a node is shared by all Stingray processes.

Pool: max_idle_connections_pernode: When an HTTP request to a node completes, Stingray will generally hold the TCP connection open and reuse it for a subsequent HTTP request (as a KeepAlive connection), avoiding the overhead of tearing down and setting up new TCP connections.  In general, you should set this to the same value as max_connections_pernode, ensuring that neither setting exceeds the concurrency limit of the webserver.
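The guidance for these two settings can be expressed as a small consistency check. This is a sketch only: the function is hypothetical, and webserver_limit stands for whatever concurrency limit your webserver enforces (for example, Apache's MaxClients).

```python
def check_pool_limits(max_connections_pernode, max_idle_connections_pernode,
                      webserver_limit):
    """Return a list of problems with the pool limits (empty if none)."""
    problems = []
    if max_connections_pernode > webserver_limit:
        problems.append(
            "max_connections_pernode exceeds the webserver concurrency limit")
    if max_idle_connections_pernode > webserver_limit:
        problems.append(
            "max_idle_connections_pernode exceeds the webserver concurrency limit")
    if max_idle_connections_pernode != max_connections_pernode:
        problems.append(
            "max_idle_connections_pernode differs from max_connections_pernode")
    return problems


# Hypothetical example: a webserver limited to 256 concurrent clients.
for problem in check_pool_limits(300, 150, webserver_limit=256):
    print(problem)
```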

Global Setting: max_idle_connections: Use this setting to fine-tune the total number of keepalive connections Stingray will maintain to each node.  The idle_connection_timeout setting controls how quickly keepalive connections are closed.  You should only consider limiting the two max_idle_connections settings if you have a very large number of webservers that can sustain very high degrees of concurrency, and you find that the traffic manager routinely maintains too many idle keepalive connections as a result of very uneven traffic.

When running with very slow servers, or when connections to servers have a high latency or packet loss, it may be necessary to increase the Pool timeouts:

  • max_connect_time discards connections that fail to connect within the timeout period; the requests will be retried against a different server node;
  • max_reply_time discards connections that fail to respond to the request within the desired timeout; requests will be retried against a different node if they are idempotent.
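The retry behavior described for these two timeouts can be sketched as a simple decision function. This is an illustration, not Stingray's code: the failure labels are hypothetical, and the set of idempotent methods is taken from the HTTP specification, which is an assumption about what Stingray itself treats as idempotent.

```python
# Idempotent methods as defined by the HTTP specification (assumption:
# Stingray's own classification may differ).
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def should_retry(failure, method):
    """Decide whether a failed request is retried against another node.

    failure is a hypothetical label: "connect_timeout" when
    max_connect_time was exceeded, "reply_timeout" when max_reply_time
    was exceeded.
    """
    if failure == "connect_timeout":
        # The request never reached the node, so a retry is always safe.
        return True
    if failure == "reply_timeout":
        # The node may have acted on the request; retry only if repeating
        # it cannot cause a second side effect.
        return method.upper() in IDEMPOTENT_METHODS
    return False


print(should_retry("connect_timeout", "POST"))  # prints True
print(should_retry("reply_timeout", "POST"))    # prints False
print(should_retry("reply_timeout", "GET"))     # prints True
```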

When streaming data between server and client, the general-purpose Virtual Server ‘timeout’ setting will apply.  If the client connection times out or is closed for any other reason, the server connection is immediately discarded.

If you suspect that connections are dropped prematurely due to timeouts, you can enable the Virtual Server setting log!server_connection_failures to record the details of dropped server connections.

Nagle’s Algorithm

You should disable “Nagle’s Algorithm” for traffic to the backend servers, unless you are operating in an environment where the servers have been explicitly configured not to use delayed acknowledgements.  Set the node_so_nagle setting to ‘off’ in the Pool Connection Management configuration.

If you notice significant delays when communicating with the back-end servers, Nagle’s Algorithm is a likely candidate.
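At the operating system level, disabling Nagle's Algorithm corresponds to setting the standard TCP_NODELAY socket option. The sketch below shows that option on a plain socket to illustrate what the setting controls; it is not Stingray's code, and the function name is hypothetical.

```python
import socket


def open_backend_socket(disable_nagle=True):
    """Create a TCP socket, optionally with Nagle's Algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if disable_nagle:
        # TCP_NODELAY sends small writes immediately instead of waiting
        # to coalesce them while earlier data is unacknowledged.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = open_backend_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(f"Nagle disabled: {bool(nodelay)}")
sock.close()
```

The delays mentioned above typically arise when Nagle's Algorithm on the sender holds back a small write while the receiver is delaying its acknowledgement.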

Other settings

Ensure that you disable or de-configure any Stingray features that you do not need to use, such as health monitors, session persistence, TrafficScript rules, logging and activity monitors.  Disable debug logging in service protection classes, autoscaling settings, health monitors, actions (used by the eventing system) and GLB services.

For more information, start with the Tuning Stingray Traffic Manager article.