
Managing consistent caches across a Stingray Cluster - the remix!

Published 04-19-2013 10:39 AM; last edited 07-08-2015 04:41 PM by PaulWallace

The article Managing consistent caches across a Stingray Cluster describes in detail how to configure a pair of Stingray devices to operate together as a fully fault-tolerant cache.

 

The beauty of that configuration was that it minimized the load on the origin servers: content was only requested from the origin servers once it had expired on both peers, and at most one request every 15 seconds (configurable) was sent to the origin servers for each item of content:

 

[Image: ccache2.png - consistent caching across a pair of Stingray Traffic Managers]

 

That solution used two Stingray Traffic Managers, with all incoming traffic directed to a single front-end traffic manager.

 

How could we extend this solution to support more than two traffic managers (for very demanding high-availability requirements), with multiple traffic managers actively handling traffic rather than a single front end?

 

Overview

 

The basic architecture of the solution is as follows:

 

  • We begin with a cluster of 3 Stingray Traffic Managers, named stm-1, stm-2 and stm-3, with a multi-hosted IP address distributing traffic across the three traffic managers
  • Incoming traffic is looped through all three traffic managers before being forwarded to the origin servers; the return traffic can then be cached by each traffic manager

[Image: Screen Shot 2013-04-19 at 18.14.07.png - traffic looped through the three traffic managers before reaching the origin servers]

 

  • If any of the traffic managers have a cached version of the response, they respond directly

 

Configuration

 

Start with a working cluster.  In this example, the names 'stm-1', 'stm-2' and 'stm-3' resolve to the permanent IP addresses of each traffic manager; replace these with the hostnames of the machines in your cluster.  The origin servers are webserver1, webserver2 and webserver3.

 

Step 1: Create the basic pool and virtual server

 

Create a pool named 'website0', containing the addresses of the origin servers.

[Image: Screen Shot 2013-04-19 at 18.22.35.png - the 'website0' pool configuration]

Create a virtual server that uses the 'discard' pool as its default pool.  Add a request rule to select 'website0':

 

pool.use( "website0" );

 

... and verify that you can browse your website through this virtual server.

 

Step 2: Create the additional pools

 

You will need to create N * (N-1) additional pools if you have N traffic managers in your cluster; for the three traffic managers in this example, that is 3 * 2 = 6 additional pools.

 

Pools website10, website20 and website30 contain the origin servers and either node stm-1:80, stm-2:80 or stm-3:80.  Edit each pool and enable priority lists so that the stm node is used in preference to the origin servers:

 

[Image: plists.png]

Configuration for Pools website10 (left), website20 (middle) and website30 (right)

 

Pools website230, website310 and website120 contain the origin servers and two of the nodes stm-1:80, stm-2:80 and stm-3:80.  Edit each pool and enable priority lists so that the stm nodes are each used in preference to the origin servers.

 

For example, pool website310 will contain nodes stm-3:80 and stm-1:80, and have the following priority list configuration:

 

[Image: Screen Shot 2013-04-19 at 18.38.37.png - priority list configuration for pool website310]
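
If you want to double-check the full set of pool names and priority orders before creating them by hand (or to script their creation for a larger cluster), here is a minimal Python sketch that reproduces the naming scheme used in this article.  It is purely illustrative and assumes the 'stm-N' and 'webserverN' names used throughout; it is not part of the Stingray configuration itself.

# Sketch: enumerate the N*(N-1) additional pools and their priority lists.
# Pool names follow the convention used in this article:
#   'website' + <chain of remaining traffic manager ids> + '0'
# and each priority list runs through those traffic managers in chain order
# before falling back to the origin servers.

def extra_pools(n, origins):
    ids = [str(i) for i in range(1, n + 1)]
    pools = {}
    for start in range(n):
        rotation = ids[start:] + ids[:start]     # e.g. ['2', '3', '1'] for stm-2
        for hop in range(1, n):
            chain = rotation[hop:]               # traffic managers still to visit
            name = "website" + "".join(chain) + "0"
            # One priority group per remaining traffic manager, highest first,
            # with the origin servers as the final fallback group.
            pools[name] = [["stm-%s:80" % tm] for tm in chain] + [origins]
    return pools

if __name__ == "__main__":
    origins = ["webserver1:80", "webserver2:80", "webserver3:80"]
    for name, groups in sorted(extra_pools(3, origins).items()):
        print(name, "->", groups)

For N = 3 this prints the six pools described above (website10, website20, website30, website120, website230 and website310), each with its priority ordering.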

 

Step 3: Add the TrafficScript rule to route traffic through the three Stingrays

 

Enable trafficscript!variable_pool_use (Global Settings > Other Settings), then add the following TrafficScript request rule:

 

# Consistent cache with multiple active traffic managers

# Each traffic manager's numeric id and default chain: the rotation of the
# cluster starting at this traffic manager.
$tm = [
   'stm-1' => [ 'id' => '1', 'chain' => '123' ],
   'stm-2' => [ 'id' => '2', 'chain' => '231' ],
   'stm-3' => [ 'id' => '3', 'chain' => '312' ]
];

$me = sys.hostname();
$id = $tm[$me]['id'];

# Use the chain supplied by the previous hop, or start a fresh chain if this
# is the first traffic manager to see the request.
$chain = http.getHeader( 'X-Chain' );
if( !$chain ) $chain = $tm[$me]['chain'];

log.info( "Request " . http.getPath() . ": ".$me.", id ".$id.": chain: ".$chain );

# Remove ids from the front of the chain up to and including our own id,
# leaving the traffic managers that still need to see this request.
do {
   $i = string.left( $chain, 1 );
   $chain = string.skip( $chain, 1 );
} while( $chain && $i != $id );

log.info( "Request " . http.getPath() . ": New chain is ".$chain.", selecting pool 'website".$chain."0'");

# Pass the remaining chain downstream and pick the matching pool, e.g.
# chain '31' -> pool 'website310'; an empty chain -> pool 'website0'.
http.setHeader( 'X-Chain', $chain );
pool.use( 'website'.$chain.'0' );

 

Leave the debugging 'log.info' statements in for the moment; you should comment them out when you deploy in production.

 

How does the rule work?

 

When traffic is received by a traffic manager (for example, the traffic manager with hostname stm-2), the rule looks up the chain of traffic managers that should process the request: traffic managers 2, 3 and 1 (the chain '231').

 

  • It updates the chain by removing '2' from the start, and then selects pool 'website310'.

 

  • This pool selects stm-3 in preference, then stm-1 (if stm-3 has failed), and finally the origin servers if both devices have failed.

 

  • stm-3 will process the request, check the chain (which is now '31'), remove itself from the start of the chain and select pool 'website10'.

 

  • stm-1 will then receive the chain '1', remove itself to leave an empty chain, and select pool 'website0', which contains only the origin servers.

 

This way, a route for the traffic is threaded through all of the working traffic managers in the cluster.
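
To make the id-stripping step easier to follow, here is a small Python model of the same chain logic (purely illustrative; in a real deployment this work is done by the TrafficScript rule above).  It traces a request that first arrives at stm-2 and prints the X-Chain value and pool selected at each hop.

# Model of the chain-walking logic from the TrafficScript rule: strip ids
# from the front of the chain up to and including our own id, then forward
# to pool 'website' + <remaining chain> + '0'.

DEFAULT_CHAINS = {"stm-1": "123", "stm-2": "231", "stm-3": "312"}
IDS = {"stm-1": "1", "stm-2": "2", "stm-3": "3"}

def process(hostname, incoming_chain=None):
    chain = incoming_chain or DEFAULT_CHAINS[hostname]   # X-Chain, or a fresh chain
    my_id = IDS[hostname]
    while chain:
        head, chain = chain[0], chain[1:]
        if head == my_id:
            break
    return chain, "website%s0" % chain

# Trace a request that enters the cluster at stm-2 (all nodes healthy):
chain = None
for hop in ("stm-2", "stm-3", "stm-1"):
    chain, pool = process(hop, chain)
    print("%s: X-Chain becomes '%s', selects pool %s" % (hop, chain, pool))

Running this prints website310, website10 and finally website0, matching the walk-through above.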

 

Testing the rule

 

You should test the configuration with a single request at a time; it can be very difficult to unravel what happened if multiple requests are in flight through this configuration at once.

 

Note that each traffic manager in the cluster will log its activity, but these logs are merged with only per-second accuracy, so the entries will likely appear out of order.  For testing, you could add a 'connection.sleep( 2000 )' call to the rule to work around this.

 

Enable caching

 

Once you are satisfied that the configuration forwards each request through every traffic manager, and that failures are handled appropriately, you can configure caching.  The details of the configuration are explained in the Managing consistent caches across a Stingray Cluster article:

 

[Image: cachesettings.png - content caching settings]

 

Test the configuration using a simple, repeated GET for a cacheable object:

 

$ while sleep 1 ; do wget http://192.168.35.41/zeus/kh/logo.png ; done
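
If you would rather watch the responses from a script, here is a rough Python equivalent of the shell loop above (purely illustrative; it reuses the example URL from this article, so substitute the address of your own virtual server):

# Fetch the cacheable object once a second and print a timestamped summary,
# which makes the once-per-15-seconds origin refreshes easier to correlate
# with the traffic manager logs.
import time
import urllib.request

URL = "http://192.168.35.41/zeus/kh/logo.png"   # example address from this article

while True:
    start = time.time()
    with urllib.request.urlopen(URL) as response:
        status = response.status
        body = response.read()
    print("%s  status=%d  bytes=%d  %.1f ms"
          % (time.strftime("%H:%M:%S"), status, len(body),
             (time.time() - start) * 1000.0))
    time.sleep(1)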

 

Just as in the Consistent Caches article, you'll see that all Stingrays have the content in their cache, and it's refreshed from one of the origin servers once every 15 seconds:

[Image: Screen Shot 2013-04-19 at 17.47.26.png - cache contents of the three traffic managers]

 

Notes

 

This configuration used a Multi-Hosted IP address to distribute traffic across the cluster.  It works just as well with single-hosted addresses, and that can make testing somewhat easier because you can control which traffic manager receives the initial request.

 

You could construct a similar configuration using failpools rather than priority lists.  The disadvantage of using failpools is that Stingray would treat the failure of a Stingray node as a serious error (because an entire pool would have failed), whereas with priority lists the failure of a node is reported as a warning.  A warning is more appropriate because the configuration can easily accommodate the failure of one or two Stingray nodes.

 

Performance should not be unduly affected by the need to thread requests through multiple traffic managers.  All cacheable requests are served directly by the traffic manager that received the request.  The only requests that traverse multiple traffic managers are those that are not in the cache, either because the response is not cacheable or because it has expired according to the 'one check every 15 seconds' policy.
