vADC Forum

New Contributor
Posts: 2
Registered: ‎11-13-2013

autoscaling traffic manager pools with aws

Hi,


I'm trying to understand how autoscaling with pools works, specifically with AWS. Can someone explain to me exactly what the following options mean:


autoscale!enabled and autoscale!external

Please do not just direct me to the documentation as it does not in any way explain how this actually works. (I am looking at the "Stingray Traffic Manager User Manual version 9.4").


If, for example, I have autoscale!external enabled, does this mean that the traffic manager is passive and only updates instances in the pool after it is triggered by something? If so, then what is this something? Is it AWS? If so, then how is this possible? I am not aware of AWS autoscaling having this capability at all. Or do I have to have something that updates the pool itself after the autoscaling group changes? (So basically monitor the autoscale group, and if there's a change, update the pool.) If so, then what is the point of even having an autoscale-aware pool? Why not just add/remove nodes via the API? Shouldn't the traffic manager be able to do this itself? I don't see any way of configuring an autoscale group to monitor, so I have no idea how it can do this. Unless I am missing something here?


If I have autoscale!external disabled, then what exactly is supposed to happen? Does the traffic manager create an autoscale group and a launch configuration and manage those things? If so, then how? I don't see enough parameters to do this. How does it know which VPC to do this in, or which subnets the instances should be launched in? How do I attach a user data script to the instances?


Please excuse my tone. I just want to get this to work as it is rather expensive to run in AWS and if you guys are advertising that autoscaling works for AWS then IMO it should.


Thanks

New Contributor
Posts: 2
Registered: ‎11-13-2013

Re: autoscaling traffic manager pools with aws

Update:

So I thought I'd give it another shot and decided to try setting autoscale!external to no. I added an AMI id, specified an instance size and passed a SubnetId that was in a VPC. I clicked update and watched the Event Log. I had set the max size to 4 and saw this message:


"Pool code: An autoscaled pool's state has changed from 'waiting to shrink' to 'must shrink'; have 4 nodes min=2 max=4 conf=100"

So that looked promising. I still had no idea what kind of autoscale group it was managing. Was it a proper AWS autoscale group or was it simply tagging a bunch of instances it created to track them?  No idea. The next event was very disconcerting:

"Pool code: Destruction of node with id i-0b101550 and address 172.16.73.53 is now complete"

Ok, great. What exactly did it destroy? I looked up the instance ID, and it had decided to destroy one of our pre-production servers that was in an autoscale group. Great job. How did the traffic manager decide to destroy this node? What is this logic based on? Why does the web UI not give any details about how this works or how it should be configured? WTF is going on here, guys? This is really, really horrible. The subnet ID does correspond to that IP address, so did the traffic manager just decide to start treating that entire subnet as an autoscale group and begin managing it? Again, I have no idea. The documentation does not explain what is actually going on, nor how to properly configure it. The web UI is very lacking in clarity. I never would have expected that this thing would actually start terminating running instances. Isn't this a gigantic problem?

All I can say is WTF guys. For real. How in the hell did this make it into AWS marketplace? This is totally beta shit.



Brocadian
Posts: 227
Registered: ‎11-29-2012

Re: autoscaling traffic manager pools with aws

Matt,

     I would like to start by saying we are sorry you have had a less than stellar introduction to auto-scaling... I can say that it does indeed work in AWS and we have many customers who use it to great success.  I will also work through the manual elements with you, if you want, as we are always happy to receive feedback on how we can make our documentation easier to understand!


Cost of testing

I am not sure if you are aware, but you can run the Stingray Traffic Manager Developer Edition (full featured, bandwidth limited to 1Mb/s) for just this sort of testing.  The Developer Edition is available from the marketplace: simply search for "Stingray Developer".  If your testing is low CPU, you can even run a Developer Edition on a free-tier-eligible instance in a pinch, even though that is smaller than the officially supported platforms (minimum 2 GB RAM).

To answer a few of your questions one at a time, and hopefully get your trial heading in the right direction, I have included the text from the manual in italics below and will elaborate a little as well:


autoscale!enabled


Enables or disables auto-scaling functionality. When this is enabled, nodes will be added and removed from the pool using the mechanism specified by autoscale!external.



So autoscale!enabled just controls whether autoscaling is enabled or disabled for the pool in question.


autoscale!external


Some cloud providers have their own mechanism for performing auto-scaling. In this case, a cloud provider would monitor the performance for a group of machines (those being the machines in the pool) and will add and remove machines as needed. The cloud provider will indicate to the traffic manager that it has added or removed a machine using its API. When the traffic manager has been informed of the change, it will add or remove nodes from the auto-scaling pool accordingly.




If you intend to use your cloud provider's auto-scaling mechanism, set this parameter to yes. If you intend to use the traffic manager's auto-scaling mechanism, set this to no.


So autoscale!external means the pool relies on an external scaling engine to do the actual scaling (such as some of your own code taking a feed from AWS CloudWatch metrics, for example).  As you can imagine, Stingray runs in a vast number of clouds, and some of these clouds have autoscale drivers written to interface with a Stingray Traffic Manager instance. 
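To make that concrete, here is a rough, untested sketch of the sort of glue code an external engine (or your own script) could use to mirror an AWS Auto Scaling group into a pool. The group name, pool name, REST endpoint path and payload keys are all assumptions on my part, so check them against the REST API guide for your release:

# Hypothetical external autoscaling hook: mirror an AWS Auto Scaling group
# into a Stingray pool over the REST API. The endpoint path, port and payload
# keys are assumptions -- verify them against your version's REST API guide.
import boto3
import requests

ASG_NAME = "my-app-asg"        # assumed Auto Scaling group name
POOL_URL = "https://stm-host:9070/api/tm/3.0/config/active/pools/my-app-pool"
AUTH = ("admin", "password")   # traffic manager REST API credentials

def asg_node_list(port=80):
    """Return "ip:port" entries for healthy, in-service ASG instances."""
    autoscaling = boto3.client("autoscaling")
    ec2 = boto3.client("ec2")
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[ASG_NAME])["AutoScalingGroups"][0]
    ids = [i["InstanceId"] for i in group["Instances"]
           if i["LifecycleState"] == "InService" and i["HealthStatus"] == "Healthy"]
    nodes = []
    if ids:
        for reservation in ec2.describe_instances(InstanceIds=ids)["Reservations"]:
            for instance in reservation["Instances"]:
                nodes.append("%s:%d" % (instance["PrivateIpAddress"], port))
    return nodes

def update_pool(nodes):
    """Push the current node list into the (externally) autoscaled pool."""
    # Older REST API versions use "nodes" here; newer ones use "nodes_table".
    payload = {"properties": {"basic": {"nodes": nodes}}}
    requests.put(POOL_URL, json=payload, auth=AUTH, verify=False).raise_for_status()

if __name__ == "__main__":
    update_pool(asg_node_list())

You would trigger something like this from whatever is watching the group (a CloudWatch alarm action, an SNS notification handler, a cron job, and so on).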


Introduction


The application auto-scaling option monitors the performance of a service running on a supported virtual or cloud platform. When the performance falls outside the desired service level, Stingray can then initiate an auto-scaling action, requesting that the platform deploys additional instances of the service. Stingray will automatically load balance traffic to the new instances as soon as they are available. The auto-scaling feature consists of a monitoring and decision engine, and a collection of driver scripts that interface with the relevant platform.


Auto-scaling is a property of a pool. If enabled, you do not need to provide a specific list of nodes in the pool configuration. Instead, if performance starts to degrade, additional nodes can be requisitioned automatically to provide the extra capacity required. Conversely, a pool can be scaled back to free up additional nodes when they are not required. Hence this feature can be used to dynamically react to both short bursts of traffic or long-term increases in load.


A built-in service monitor is used to determine when a pool needs to be auto-scaled up or down. The service level (i.e. response time) delivered by a pool is monitored closely. If the response time falls outside the desired level, then auto-scaling will add or remove nodes from the pool to increase or reduce resource in order to meet the service level at the lowest cost.


In brief, Auto-Scale uses a monitor to scale the number of nodes in your pool up or down to match observed demand.


How It Works


As mentioned above, the auto-scaling mechanism consists of a Decision Engine and a collection of platform-dependent Driver scripts.




The Decision Engine


This monitors the response time from the pool, and provides scale-up/scale-down thresholds. Other parameters control the minimum and maximum number of nodes in a pool, and the length of time the traffic manager will wait for the response time to stabilize once a scale-up or scale-down is completed.


For example, you may wish to maintain an SLA of 250ms. You can instruct the traffic manager to scale up (add nodes) if less than 50% of transactions are completed within this SLA, up to a maximum of 10 backend nodes. Alternatively, it should scale down (remove nodes) progressively to a minimum of 1 node if more than 95% of transactions are completed within the SLA.
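To put that rule in concrete terms, here is a purely illustrative sketch of the decision being described; it is not the engine's actual code, just the shape of the check it performs for each monitoring window:

# Illustrative only: the 250ms SLA with 50%/95% thresholds described above.
SLA_MS = 250
SCALE_UP_BELOW = 0.50    # scale up if under 50% of requests meet the SLA
SCALE_DOWN_ABOVE = 0.95  # scale down if over 95% of requests meet the SLA
MIN_NODES, MAX_NODES = 1, 10

def autoscale_decision(response_times_ms, current_nodes):
    """Return "scale_up", "scale_down" or "hold" for one monitoring window."""
    within_sla = sum(1 for t in response_times_ms if t <= SLA_MS)
    conformance = within_sla / float(len(response_times_ms))
    if conformance < SCALE_UP_BELOW and current_nodes < MAX_NODES:
        return "scale_up"
    if conformance > SCALE_DOWN_ABOVE and current_nodes > MIN_NODES:
        return "scale_down"
    return "hold"

In practice the engine also waits for the stabilisation period mentioned above after each scale-up or scale-down before deciding again, so it does not thrash.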



Note: You can manually provision nodes by editing the max-nodes and min-nodes settings in the pool. If Stingray notices that there is a mismatch between the max/min and the actual number of nodes active, then it will initiate a series of scale-up or scale-down actions.



The Cloud API Driver


Stingray includes API driver scripts for Amazon EC2, Rackspace and VMware vSphere cloud environments. Before you can create an auto-scaling pool, you must first create a set of cloud credentials pertaining to the cloud API you wish to use. These credentials contain the information required to allow a traffic manager to communicate with the aforementioned cloud providers. The precise credentials used will depend on the cloud provider that you specify. See the Cloud Credentials section of CHAPTER 7 for details.



The decision engine initiates a scale-up or scale-down action by invoking the driver with the configured credentials and parameters. The driver instructs the virtualization layer to deploy or terminate a virtual machine. Once the action is complete, the driver returns the new list of nodes in the pool and the decision engine updates the pool configuration.
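To give you a feel for that contract, here is a rough sketch of the EC2 side of a scale-up or scale-down. This is not the driver script we ship, just an illustration of the deploy/terminate-then-report-back cycle using boto3:

# Rough illustration of a driver-style scale action against EC2. Not the
# shipped driver: just the deploy/terminate cycle the decision engine relies on.
import boto3

ec2 = boto3.client("ec2")

def scale_up(ami_id, instance_type, subnet_id, port=80):
    """Launch one instance from the configured AMI and return its node entry."""
    result = ec2.run_instances(ImageId=ami_id, InstanceType=instance_type,
                               SubnetId=subnet_id, MinCount=1, MaxCount=1)
    instance = result["Instances"][0]
    return "%s:%d" % (instance["PrivateIpAddress"], port)

def scale_down(instance_id):
    """Terminate an instance that was created by a previous scale-up."""
    ec2.terminate_instances(InstanceIds=[instance_id])

The important point is that whatever the driver does against the cloud API, it finishes by handing the updated node list back to the decision engine, which then rewrites the pool configuration.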


So in your case, first you would need to create "Cloud Credentials" for AWS under Catalogs > Cloud Credentials

If you have not created any when you create a pool that is autoscaled, we put a link to the cloud credentials section for you in the GUI:

[Screenshot: pool creation page with a link to the Cloud Credentials catalog]

Once you have defined your EC2 credentials and selected them as the ones to use for your autoscaling pool, the relevant AWS settings appear.  In the "Extra Arguments" section, you can put your SubnetIDs etc.:

[Screenshot: the pool's EC2 autoscaling settings, including the "Extra Arguments" field]

So, looking at your next issue, where autoscale destroyed a node, I have a question:

What EC2 AMI did you give the autoscale pool?  The normal process for autoscaling is to provide an AMI that will boot into a working state, either because it powers up working (i.e. static content), through orchestration (like Puppet or Chef), or because the launch variables trigger some sort of auto-configuration script.

Autoscale doesn't "take over" a subnet, but rather creates instances of the listed AMI to meet its "autoscale!min_nodes" requirement, up to its "autoscale!max_nodes" setting, using the SLA settings defined in the next section to trigger scale-up and scale-down events:

[Screenshot: the pool's autoscaling SLA settings]
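As an aside, those limits can also be set programmatically rather than through the GUI. The sketch below is illustrative only; the REST section and key names are my best recollection, so please verify them against the REST API guide for your version before relying on it:

# Illustrative only: setting the pool's autoscaling limits over the REST API.
# Section/key names ("auto_scaling", "min_nodes", "max_nodes", "imageid") are
# assumptions from memory -- check the REST API guide for your release.
import requests

POOL_URL = "https://stm-host:9070/api/tm/3.0/config/active/pools/my-app-pool"
payload = {
    "properties": {
        "auto_scaling": {
            "enabled": True,
            "external": False,          # let the traffic manager drive EC2 itself
            "imageid": "ami-12345678",  # an AMI that boots straight into service
            "min_nodes": 2,
            "max_nodes": 4,
        }
    }
}
requests.put(POOL_URL, json=payload, auth=("admin", "password"),
             verify=False).raise_for_status()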

I am sure we could get to the bottom of this if you raised a support case.  Are you able to do this?

Otherwise perhaps it would be best to test the autoscale feature out in a sandbox first so you are able to assess how it will work in your environment.  Is this an option?

As always, let us know if you need help with anything else, or you can always raise a support case at support.riverbed.com.

Let us know how you get on...

Cheers.

Aidan.

New Contributor
Posts: 3
Registered: ‎12-18-2014

Re: autoscaling traffic manager pools with aws

Hi Aidan,

     These are a useful set of instructions (thanks). However, I would argue that this product still does not support AWS. I'm using 9.8r2 and have configured it based on the product manual and your comments above. I'm happy with the autoscaling configuration; however, when the traffic manager adds a node to the pool, it sends client traffic to the new instance well before the node has even completed its status checks. This means the end user is presented with a "service not available" message when the system is just autoscaling. This occurs regardless of the type of monitoring used, i.e. I monitor the HTTP service and the instance is still added to the pool before it is ready to service requests. I have raised a formal support ticket and also raised the issue here: Riverbed SteelApp autoscaling on AWS - Fit for purpose?, but unfortunately I am unable to get any traction from Riverbed on this issue. There is no way that I could consider this a mature product for AWS or make the claim that it effectively manages autoscaling on AWS yet.

Thanks,

Chris


Brocadian
Posts: 227
Registered: ‎11-29-2012

Re: autoscaling traffic manager pools with aws

Just updating this thread for the sake of completeness.  We have been working with Chris in the background through the Riverbed TAC to provide a solution to this problem.

If anyone else is hitting this issue, I ask you to please create a support case with the Riverbed TAC (remembering to set the appropriate priority on the case to reflect the impact on your business).
