
How is memory managed in TrafficScript?

Posted on 03-05-2013 03:08 PM

This article describes how TrafficScript manages the memory needed when a rule executes, and how it references connection and global data that is stored outside of a rule's execution environment.  It will help you understand the differences between local variables in TrafficScript, connection-local variables (connection.data.get/set), resource data (resource.get) and global data (data.get/set).


Overview


TrafficScript is a very lightweight compiled language.  TrafficScript rules are compiled into a Stingray-internal bytecode with about a dozen simple stack-based instructions and are executed on an internal 'virtual machine' (code-named 'ichor').  All of the 'heavy lifting' (i.e. every TrafficScript function) is implemented by internal Stingray procedures, not in the TrafficScript language itself, so the performance of TrafficScript is driven by native Stingray performance rather than by the performance of the virtual machine bytecode.

The biggest single determinant of performance in an optimized, lightweight virtual machine like ichor is the use of memory.  Ichor minimizes memory copies by passing references wherever possible, and it uses region-based memory management to keep allocation and cleanup overhead low.

In this article, we'll consider how TrafficScript addresses memory via the String datatype.  Internally, Strings are implemented as references (pointer-and-length) to memory that is often managed outside of the ichor runtime.  From ichor's perspective, this memory is read-only; multiple strings can refer to the same memory area, and memory is only copied when a new string with new contents is created.  This allows ichor to make assumptions that significantly improve execution speed and memory footprint.

A model of memory

The diagram below outlines five types of memory that TrafficScript can address.

[Diagram: memory.png]

Constants

[memory1.png] TrafficScript constants (string, integer and double values) are stored with the compiled TrafficScript rule.

Stored with the compiled TrafficScript rule: String, integer and double values that are declared in a TrafficScript rule are stored with the rule bytecode and referenced directly.  Duplicate constants are stored only once (deduplicated) to reduce the memory footprint of the compiled TrafficScript rule.

Local variables and Temporary values

[memory2.png] Variables and other temporary values are stored in a temporary memory region that is discarded once the rule has finished executing.

Stored for the scope of the rule execution: Local variables and temporary values that are created during the execution of a TrafficScript rule are stored on the execution stack and in the growable heap used by the TrafficScript virtual machine.


Because string data uses references rather than private copies, code like the following is very efficient:


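# $body refers to Stingray's internal copy of the response body
$body = http.getResponseBody();
if( string.regexMatch( $body, "(.*?)(<body.*?>)(.*)" ) ) {
  # $1, $2 and $3 are the substrings captured by the regex groups
  $head = $1;
  $bodytag = $2;
  $remainder = $3;
}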

No memory copies are made during the regex search and the assignment of the results to $head, $bodytag and $remainder.  These variables simply contain references to substrings within Stingray's internal copy of the response body.

TrafficScript variables and temporaries are only valid for the duration of the rule's execution; persistent data is copied out of the heap as a side-effect of the relevant TrafficScript operations and the heap is safely and quickly discarded in a single operation when the rule execution completes.

Per-connection data

[memory3.png] Data can be stored with a connection using connection.data.set(), and retrieved by a later rule using connection.data.get().  This is used when sharing state between the rules that process a connection.

Stored for the duration of the connection: A connection that is processed by Stingray uses a variety of memory data structures for various data types.

HTTP headers and other connection data are stored with the connection.  If a TrafficScript rule requests the value of the path or a header (for example), it is given a reference to the connection-local memory containing that value, so there are no memory copies.  If a TrafficScript rule updates the value of the path (for example, using http.setPath()), then the connection-local copy is updated so that the value persists when the TrafficScript rule completes.
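
As a minimal sketch of the read and write paths just described, the request rule below reads the path and rewrites it.  The "/legacy/" and "/app/" prefixes are hypothetical and chosen only for illustration.

```trafficscript
# Request rule (sketch).  The path prefixes are hypothetical.

# $path is a reference to the connection-local path, not a copy
$path = http.getPath();

if( string.startsWith( $path, "/legacy/" ) ) {
   # Concatenation creates a new string with new contents, so this
   # is the point where memory is allocated in the rule's own heap.
   # string.skip() drops the first 8 characters ("/legacy/").
   $newpath = "/app/" . string.skip( $path, 8 );

   # Update the connection-local value so that the change
   # persists after this rule has finished executing
   http.setPath( $newpath );
}
```

Requests that do not match the prefix only read the path, so by the model above they should not trigger any string copies.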

In the example above where the TrafficScript rule used http.getResponseBody(), all of the strings refer to the connection-local copy of the body, and this single copy is used by all TrafficScript rules that need to access it in a read-only fashion.

Note: Stingray's HTTP virtual server type abstracts an HTTP transaction from the underlying TCP connection.  Connection-local data is associated with the HTTP transaction and discarded when the transaction completes.

connection.data.set(): You can explicitly store data with connection scope using the connection.data.set() TrafficScript function.  This places a copy of the data in the connection's growable memory pool, and this data can be retrieved by a later TrafficScript rule (connection.data.get()) or a transaction log macro.  The connection's memory pool is discarded in a single operation once the connection has completed.
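
A typical use is to pass a value from a request rule to a response rule on the same connection.  The sketch below assumes a hypothetical key name ("orig_path") and a hypothetical response header ("X-Original-Path"), and it assumes that an unset key reads back as the empty string.

```trafficscript
# Request rule (sketch): remember the path the client asked for,
# before any later rule rewrites it.  "orig_path" is a hypothetical key.
connection.data.set( "orig_path", http.getPath() );
```

```trafficscript
# Response rule (sketch): read the value stored by the request rule.
$orig = connection.data.get( "orig_path" );

# Assumption: an unset key is returned as the empty string
if( $orig != "" ) {
   # "X-Original-Path" is a hypothetical header name
   http.setResponseHeader( "X-Original-Path", $orig );
}
```

A local variable could not carry this value, because the request rule's variables are discarded when that rule completes.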

Per-process data

[memory4.png] Resource files are stored per-process and can be referenced efficiently using resource.get() and related functions.

The most common per-process data that TrafficScript will address is the contents of resource files.  Resource files sit in the extra section of the Stingray configuration.  They are loaded into memory at startup, reloaded whenever they change on disk, and kept in memory so that rules never need to read them from disk.

resource.get() returns a reference to the body of the already-loaded resource file.  In a similar fashion, resource.getMTime() and resource.getMD5() return the pre-calculated values so there is no disk or compute overhead from invoking these functions.
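
As an illustration, a request rule can serve a small static file straight from the already-loaded resource.  This is a sketch only: "robots.txt" is a hypothetical file in the extra section, and using the MD5 value as an ETag is an illustrative choice, not something the article prescribes.

```trafficscript
# Request rule (sketch).  "robots.txt" is a hypothetical resource file.
$file = "robots.txt";

if( http.getPath() == "/robots.txt" && resource.exists( $file ) ) {
   # Reference to the in-memory file contents; no disk read here
   $body = resource.get( $file );

   # Pre-calculated when the file was loaded; no hashing here
   $etag = resource.getMD5( $file );

   http.sendResponse( "200 OK", "text/plain", $body,
                      "ETag: \"" . $etag . "\"" );
}
```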

Global data

[memory5.png] Data can be shared between all rules using the global key-value store.  Access the data using data.get(), data.set() and related functions.

On a multi-core machine, Stingray will typically run one zeus.zxtm traffic manager process per core.  These processes share a fixed-size shared memory segment that is allocated at startup.

This shared memory segment is used for a number of purposes - sharing session persistence data, bandwidth and rate data, the web content cache, etc.  It includes a key-value store called the Global Data Table that you address using TrafficScript functions such as data.set() and data.get().

The Global Data Table is the key memory resource to use if you want to share data between different connections (the alternative is an external solution accessed using a Java Extension or other external callout, or a client-side cookie).  Keys and values are stored as strings (other types are serialized and deserialized on demand), and the size of the table is fixed (trafficscript!data_size) so you must track and discard entries yourself.  Without locking, iterators or memory management, using the global data table effectively can be a challenge.

  • data.set( $key, $value ): put a copy of the key/value pair in the global data table, serializing non-string data structures where necessary
  • data.get( $key ): return the corresponding value, de-serializing where necessary, or the empty string if $key is not recognised
  • data.remove( $key ): removes the key/value pair from the table, freeing the memory used by both
  • data.reset( [$prefix] ): removes every entry, or just entries where the key begins with $prefix, from the table, freeing memory
  • data.getMemoryUsage() and data.getMemoryFree(): indicate memory usage and can be used to detect impending memory exhaustion
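
The functions above can be combined into a simple housekeeping pattern.  The sketch below counts requests per client IP and clears its own entries when the table is nearly full.  The "hits:" key prefix and the 65536 threshold are hypothetical, and the sketch assumes data.getMemoryFree() reports a byte count.

```trafficscript
# Request rule (sketch).  The "hits:" prefix and the threshold
# below are hypothetical values chosen for illustration.
$key = "hits:" . request.getRemoteIP();

# The table has a fixed size, so discard our own entries
# before it fills up rather than let data.set() fail
if( data.getMemoryFree() < 65536 ) {
   log.warn( "Global data table nearly full; clearing hits: entries" );
   data.reset( "hits:" );
}

# data.get() returns the empty string for an unknown key
$count = data.get( $key );
if( $count == "" ) {
   $count = 0;
}

data.set( $key, $count + 1 );
```

Because there is no locking, two processes can read the same count and both write back the same incremented value, so treat the result as approximate.  Using a common prefix for related keys is what makes the selective data.reset( $prefix ) cleanup possible.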
