vADC Blog

Customized Load-Balancing with Stingray (07/03/2009)

by aknox on 07-02-2012 11:04 AM


Choosing the right load-balancing algorithm can often be difficult as every web application behaves differently. Stingray offers a wide variety of built-in load-balancing algorithms to overcome this problem, but if none of these suit your needs then you can just design your own! This article shows you how.


Typically, an application server will start to slow down when it becomes overloaded. Its response times will increase and it will not be able to handle as many connections. Stingray can easily detect these problems and start directing traffic away from that server. Stingray's 'Fastest Response Time' and 'Least Connections' algorithms are two examples that will accomplish exactly this task.

Sometimes, though, there will be a particular application that just doesn't let on when it is busy. It will soldier on with impeccable response times right up until the server crashes from the excessive load. None of Stingray's built-in load balancing algorithms would be able to detect this problem until the server has actually crashed.

Customized Load-Balancing

This is an ideal situation to use a custom load-balancing algorithm that can gather whatever information it likes and use that to influence where Stingray will send its requests. In this case, we would like to use the server's CPU utilization so we can tell Stingray to back off sending requests to a particular server when its resources are running low. The algorithm would therefore be effective regardless of how quickly the server is responding to requests.

To create a customized load-balancing algorithm, you can use the Weighted Round Robin algorithm for your pool and then use Stingray’s SOAP API to alter the weighting of each node in that pool depending on how busy it is. The program that gathers information from the nodes and makes the SOAP calls can run on the same machine as your Stingray, or on another machine on the network that can access both the SOAP API of your Stingray and all the nodes in the pool that you want to load balance.


The following is a general outline of what such a program should do:

  • Get a list of the nodes in the pool being load-balanced
  • Gather data from each node in the list
  • Process the data to work out which nodes should receive the most traffic
  • Set the weighting of each node in the pool accordingly
  • Wait for a few seconds, then repeat
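The outline above can be sketched in a few lines of Python. This is not the Perl program discussed in this article, only an illustration of the loop. The four callables (get_nodes, read_metric, compute_weights, set_weights) are hypothetical placeholders for whatever you use to query the pool, measure a node, turn measurements into weights and push weights to Stingray.

```python
import time
from typing import Callable, Dict, List


def rebalance_once(
    get_nodes: Callable[[], List[str]],
    read_metric: Callable[[str], float],
    compute_weights: Callable[[Dict[str, float]], Dict[str, int]],
    set_weights: Callable[[Dict[str, int]], None],
    last_weights: Dict[str, int],
) -> Dict[str, int]:
    """Run one pass of the outline and return the weights now in force."""
    nodes = get_nodes()                                     # list the nodes in the pool
    metrics = {node: read_metric(node) for node in nodes}   # gather data from each node
    weights = compute_weights(metrics)                      # decide who gets the most traffic
    if weights != last_weights:                             # only push when something changed
        set_weights(weights)
    return weights


def run_forever(get_nodes, read_metric, compute_weights, set_weights,
                interval_seconds: float = 5.0) -> None:
    """Repeat the pass, waiting a few seconds between iterations."""
    last_weights: Dict[str, int] = {}
    while True:
        last_weights = rebalance_once(
            get_nodes, read_metric, compute_weights, set_weights, last_weights
        )
        time.sleep(interval_seconds)
```

Passing the four steps in as functions keeps the loop independent of how the data is collected, so the same skeleton works whether the metric arrives over SSH or by some other route.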

An Example

This is the source code of a simple Perl program that will find out how busy the CPU of each node in the pool is and use that information to determine what the node's priority should be:

Perl source code

(save as and make it executable)

The program takes four arguments: the location of Stingray's SOAP interface (this is the same location you browse to when accessing the Administration Server), the username and password used to access the SOAP interface, and the name of the pool to which you want to apply the custom load-balancing algorithm.

./ myzxtm:9090 admin admin MyPool

The program has several internal parameters that you can change to customize how requests will be distributed:

Metric: This is the command that will be executed on each node to obtain the information used to determine the nodes' priorities. The command used in this case is:

vmstat 5

This command reports the resources being used by a system, such as how much memory is available and how active the CPU is, once every 5 seconds. You can shorten or lengthen the reporting interval if your CPU utilization is likely to change more rapidly or more slowly. We only want to find out how busy the CPU is, but vmstat reports a lot of additional information. Functions in the main body of the code therefore pull the relevant value out of the output returned by vmstat and ensure it is within a reasonable range. If you want to use a different metric, you can modify this code to extract the relevant data from the output generated by your command.
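As an illustration of that extraction step, here is a minimal Python sketch (not the Perl code from the program) that takes the text printed by vmstat and returns CPU utilization as a percentage. It assumes the usual Linux layout, in which a header row names the columns and the idle CPU percentage appears under "id"; the function name cpu_busy_percent is made up for this example.

```python
def cpu_busy_percent(vmstat_output: str) -> float:
    """Return CPU utilization (100 - idle) from the last sample in vmstat output."""
    rows = [line.split() for line in vmstat_output.splitlines() if line.strip()]
    # The column header row is the one that contains the "id" (idle) label.
    header = next((cols for cols in rows if "id" in cols), None)
    if header is None:
        raise ValueError("no vmstat header row with an 'id' column found")
    idle_index = header.index("id")
    # The most recent sample is the last row of the output.
    idle = float(rows[-1][idle_index])
    # Clamp so that odd readings stay within a reasonable range.
    return min(100.0, max(0.0, 100.0 - idle))
```

For a different metric, only the column label and the final arithmetic would need to change.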

Ceiling: This number indicates the point at which the node should be given the minimum priority. In this example we use 80, so a node whose CPU is 80% utilized is assigned the minimum possible priority relative to the other nodes in the pool. This gives the server a bit of leeway to recover once it starts to become overloaded. If the ceiling parameter is set to 0 then the program will assign priorities based on how heavily the server is loaded relative to the other nodes in the pool. This is useful if there is no upper limit to the metric being reported (for example, if it were to report the current number of requests being served).
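The two ceiling modes can be sketched as follows. This is an assumption about how such scaling might be written, not a copy of the program's logic: each raw metric is mapped to a load fraction between 0 and 1, where 1 means "give this node the minimum priority". The function name scaled_load is hypothetical.

```python
from typing import Dict


def scaled_load(metrics: Dict[str, float], ceiling: float) -> Dict[str, float]:
    """Map raw metrics to load fractions in [0, 1]; 1.0 means minimum priority."""
    if ceiling > 0:
        # Fixed ceiling: anything at or above the ceiling counts as fully loaded.
        return {n: min(max(v, 0.0) / ceiling, 1.0) for n, v in metrics.items()}
    # Ceiling of 0: scale relative to the busiest node in the pool.
    highest = max(metrics.values(), default=0.0)
    if highest <= 0:
        return {n: 0.0 for n in metrics}
    return {n: max(v, 0.0) / highest for n, v in metrics.items()}
```

With a ceiling of 80, CPU readings of 40, 80 and 95 become 0.5, 1.0 and 1.0. With a ceiling of 0, request counts of 30 and 120 become 0.25 and 1.0.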

Granularity: This parameter defines how finely the priorities are graded, and therefore how often they are updated. If set to 100, nodes are assigned priorities between 1 and 100, so a change of about 1% in a node's idle CPU time changes the priority of that node. In this example, we set the granularity to 10, so only large changes in the idle time result in the program making a SOAP call to Stingray.
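To see why a coarser granularity means fewer updates, consider this small Python sketch. It assumes the load has already been expressed as a fraction between 0 (idle) and 1 (at the ceiling); the function name weight_for and the use of rounding are choices made for this example, not details taken from the program.

```python
def weight_for(load_fraction: float, granularity: int) -> int:
    """Convert a load fraction in [0, 1] to a weight between 1 and granularity."""
    load_fraction = min(max(load_fraction, 0.0), 1.0)
    # An idle node gets the full weight; a fully loaded node gets the minimum of 1.
    return max(1, round((1.0 - load_fraction) * granularity))
```

With a granularity of 10, loads of 0.42 and 0.44 both map to a weight of 6, so a small fluctuation causes no SOAP call. With a granularity of 100 the same loads map to 58 and 56, and the weights would be rewritten.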

When executed, the program will check the pool you specified, get a list of the nodes, obtain the CPU utilization for each of the nodes, calculate the priorities of each node based on that information and then send a command to Stingray to update the weights of the nodes accordingly! It will continue to monitor the CPU usage of the nodes and will update the weightings once every 5 seconds.

SSH Commands

Note that the program uses SSH to run the commands that obtain the CPU utilization for each node. You will have to enter the password for each node every time you start the program unless you configure SSH to allow you to log on without a password. A quick search for 'passwordless ssh' provides plenty of information on how to do this.
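If you script the SSH step yourself, something along these lines may be useful. It is a Python sketch rather than the program's own code, and the helper names are made up. It relies on two standard OpenSSH client options: BatchMode=yes makes ssh fail immediately instead of prompting for a password, and ConnectTimeout bounds how long a dead node can stall the loop. The default command "vmstat 5 2" asks vmstat for two reports and then exits, so the call returns after roughly five seconds.

```python
import subprocess
from typing import List


def ssh_command(host: str, remote_command: str) -> List[str]:
    """Build an ssh invocation that never stops to ask for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # fail rather than prompt if key login is not set up
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable nodes
        host,
        remote_command,
    ]


def run_on_node(host: str, remote_command: str = "vmstat 5 2",
                timeout: float = 30.0) -> str:
    """Run the command on the node and return its standard output."""
    result = subprocess.run(
        ssh_command(host, remote_command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if ssh or the command fails
    )
    return result.stdout
```

A failure raised here is a signal that passwordless login is not yet working for that node, which is easier to diagnose than a program that sits waiting at a hidden password prompt.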

Alternatively, if you would prefer not to perform the commands using SSH, you could modify the program to listen for information from the nodes on an internet socket. You would then need to run separate programs on each of the nodes that send the desired information to this socket. Once the program has information from all the nodes it can calculate the priorities as before.
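A minimal sketch of the listening side might look like this in Python. Everything here is an assumption made for illustration: the nodes are imagined to send small UDP datagrams of the form "name value" (for example b"web1 37.5"), and the function names are hypothetical. The caller creates and binds the socket.

```python
import socket
from typing import Dict, Iterable, Tuple


def parse_report(datagram: bytes) -> Tuple[str, float]:
    """Split a 'name value' datagram into the node name and its metric."""
    name, value = datagram.decode("ascii").split()
    return name, float(value)


def collect_reports(sock: socket.socket, expected: Iterable[str],
                    timeout: float = 5.0) -> Dict[str, float]:
    """Read datagrams until every expected node has reported or a read times out."""
    pending = set(expected)
    reports: Dict[str, float] = {}
    sock.settimeout(timeout)  # applies to each individual receive call
    while pending:
        try:
            data, _sender = sock.recvfrom(1024)
        except socket.timeout:
            break  # return what we have; the caller decides how to treat missing nodes
        try:
            name, value = parse_report(data)
        except ValueError:
            continue  # ignore malformed datagrams
        if name in pending:
            reports[name] = value
            pending.discard(name)
    return reports
```

On each node, the companion program would periodically send its reading with something like sock.sendto(b"web1 37.5", (collector_host, port)). Nodes missing from the returned dictionary did not report in time, and a cautious choice is to give them the minimum weight.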