Campus Network Solution, Best Practice-sFlow Monitoring for Brocade Products

Posted 05-20-2013 03:19 PM, edited 08-06-2014 08:44 AM

 

SYNOPSIS: Review of how to use sFlow, an industry standard monitoring technology included in many Brocade products, to simplify network management and traffic monitoring.

Preface

Overview

For an IT manager, keeping ahead of the growing demand for network resources is vital, but simply adding capacity is not the answer. Networks must be optimized so that all resources are used as efficiently as possible, which means usage trends need to be tracked and responded to intelligently.

sFlow allows IT managers to gain visibility into every element of a network regardless of the speed or complexity of the system being monitored. It delivers data that allows every aspect of a network to be proactively managed thus optimizing efficiency and ensuring that services are not compromised. Typical applications for sFlow include:

  • Identifying and troubleshooting network issues
  • Traffic flow monitoring and management
  • Usage measurement and accounting
  • Trend analysis, including tracking changes in application use
  • Detecting malicious network activity and tracing the source of an attack

sFlow delivers the scalability required to allow even the largest networks to be monitored and measured quickly and efficiently.

 

Purpose of This Document

This document provides best practices for using sFlow with Brocade products.

 

Audience

Network designers who need to understand how to use the sFlow service embedded in Brocade products with sFlow monitoring software.

 

Related Documents

References

 

Key Contributors

The content in this guide was developed by the following key contributors.

  • Lead Engineer: Simon Pollard, Product Manager

Document History

Date                  Version        Description

2013-04-09         1.0                Initial Release

 

About Brocade

Brocade® (NASDAQ: BRCD) networking solutions help the world’s leading organizations transition smoothly to a world where applications and information reside anywhere. This vision is designed to deliver key business benefits such as unmatched simplicity, non-stop networking, application optimization, and investment protection.

Innovative Ethernet and storage networking solutions for data center, campus, and service provider networks help reduce complexity and cost while enabling virtualization and cloud computing to increase business agility.

To help ensure a complete solution, Brocade partners with world-class IT companies and provides comprehensive education, support, and professional services offerings. (www.brocade.com)

 

What is sFlow?

sFlow is an industry standard network and system monitoring technology. sFlow standards are administered by sFlow.org, an independent consortium of product vendors and end users. sFlow is not a product but the definition of a data collection architecture composed of a number of components.

 

sFlow Agent:  The sFlow Agent is embedded in a switch or router, in either software or hardware, and captures traffic samples from the device or network element that it is monitoring. There is no limit to the type of device on which the sFlow Agent can be implemented; it works just as well on low-end Layer 2 switches as on the most sophisticated high-end Layer 3 routers.

sFlow Datagram:  The data packet used to transport collected data from the Agent to the Collector.

sFlow Collector:  A device, such as a network management system or server, to which the Agent sends sFlow datagrams.

sFlow Analyzer:  The Analyzer takes the data held in the Collector and uses it to create charts and reports that characterize network behavior. Analysis can be real-time or historical.

 

Note: The sFlow collector and analyzer functions are often combined into a single platform.

 

   Figure: sFlow Architecture with Components

 

Key Benefits

The following are important benefits from using sFlow for network monitoring.

 

  • Flexible. sFlow was conceived and designed specifically to meet the continually evolving requirements of packet networks. It is protocol independent and as such can be used to monitor IPv4 and IPv6 as well as many legacy protocols.
  • Low Network Impact. The sFlow sampling infrastructure is implemented in hardware, which maximizes performance. Because not every packet is interrogated, sample rates can be adjusted to further minimize the impact on the host networking device and to limit the volume of sFlow datagrams generated. The sFlow overhead is typically less than 0.02% even at aggressive sampling rates.
  • Scalable. Large numbers of links and their associated flows can be monitored at wire rate regardless of speed and without impacting the performance of the network. Because of its very low impact on the network, sFlow can be used to continually collect data from all links, ensuring that real-time data is always available for trend analysis and root-cause analysis of network issues.
  • True End-to-End Visibility. sFlow can be applied to any link, so every part of a network can be monitored and sources of congestion, policy violations and security threats can be quickly identified wherever they occur. A network that cannot be monitored and analyzed is vulnerable. Monitoring is not limited to physical links: networks that employ virtualization techniques such as Virtual Leased Lines (VLL), Virtual Private LAN Service (VPLS), and Virtual Routing and Forwarding (VRF) can use sFlow technology to monitor virtual endpoints. Multi-Protocol Label Switching (MPLS) Virtual Circuit and endpoint interfaces for protocol tunnels can also be monitored via sFlow.
  • Broad Adoption. Because sFlow is a standard, it has been implemented by a large number of switch and router manufacturers as well as many network management vendors. This means that sFlow can be used in networks composed of products from multiple manufacturers, eliminating the limitations associated with single-vendor lock-in.

 

How sFlow Works

sFlow is a packet sampling technology: it collects data from a random selection of the packets passing through an interface. Randomizing the samples avoids synchronization with traffic patterns, and while the resulting data is not 100% accurate, the errors can be quantified and therefore accounted for in any analysis. For details of the calculations used to determine sampling rates and the resulting accuracy please refer to the Billing and Accounting section below.
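The effect of random sampling can be illustrated with a short simulation. This is only a sketch under stated assumptions: the packet count, the 1-in-100 rate and the function names are hypothetical, and each packet is picked independently with probability 1/N, which is a simplification of what a forwarding ASIC actually does.

```python
import random


def sample_packets(total_packets, n, rng):
    """Count how many of total_packets are picked when each packet is
    sampled independently with probability 1/n (random 1-in-N sampling)."""
    return sum(1 for _ in range(total_packets) if rng.random() < 1.0 / n)


def estimate_total(samples, n):
    """Scale the sample count back up to an estimate of the true total."""
    return samples * n


rng = random.Random(42)      # fixed seed so the run is repeatable
total = 200_000              # hypothetical number of packets on the interface
n = 100                      # hypothetical 1-in-100 sampling
samples = sample_packets(total, n, rng)
estimate = estimate_total(samples, n)
relative_error = abs(estimate - total) / total
print(samples, estimate, f"{relative_error:.2%}")
```

The estimate is rarely exact, but it stays close to the true total, and the size of the deviation depends on how many samples were taken.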

 

To ensure wire-speed performance with minimal processing overhead, packet sampling is normally performed by the forwarding ASIC within a switch or router. The first 128 bytes of a sampled packet are passed to the sFlow agent, which combines them with information from the forwarding and routing tables to create a record. The record is encapsulated in a datagram (UDP packet) that contains some or all of the following information:

 

  • Packet header
  • Source and destination interfaces
  • Sampling process parameters (e.g. rate, pool, etc.)
  • Forwarding information (IPv4 and IPv6)
    • Source and destination 802.1p
    • Source and destination 802.1Q
    • Source and destination mask
    • Next hop router
    • Full BGP AS path
    • BGP communities and local preferences
    • IP address of the next hop router
    • Outgoing VLAN ID
    • Source IP address prefix length
    • Destination IP address prefix length
  • MPLS Forwarding
    • Tunnel, VC, FEC, FTN and VLAN encapsulations
  • User IDs (source and destination)
    • TACACS
    • RADIUS
  • Interface statistics
    • RFC1573, RFC2233, RFC2358

The completed datagrams are immediately sent to the sFlow Collector where they can be used for real-time reporting and/or stored for later analysis. Immediate transmission of the datagram ensures that the processor and memory overhead is minimized within the switch or router while also guaranteeing that the sFlow collector always has an accurate real-time view of the entire network.
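On the collector side, the first step is to decode the fixed fields at the start of each datagram. The sketch below assumes the sFlow version 5 layout published by sFlow.org (this document does not state which version a given device exports) and only reads the header, not the sample records that follow. The function name and the example values are hypothetical.

```python
import ipaddress
import struct


def parse_datagram_header(data):
    """Decode the header fields of an sFlow version 5 datagram.

    All fields are 32-bit big-endian integers except the agent address,
    which is 4 bytes (type 1, IPv4) or 16 bytes (type 2, IPv6)."""
    version, addr_type = struct.unpack_from("!II", data, 0)
    if version != 5:
        raise ValueError(f"unsupported sFlow version: {version}")
    offset = 8
    if addr_type == 1:
        agent = ipaddress.IPv4Address(data[offset:offset + 4])
        offset += 4
    elif addr_type == 2:
        agent = ipaddress.IPv6Address(data[offset:offset + 16])
        offset += 16
    else:
        raise ValueError(f"unknown agent address type: {addr_type}")
    sub_agent, sequence, uptime_ms, num_samples = struct.unpack_from(
        "!IIII", data, offset)
    return {
        "agent": str(agent),
        "sub_agent": sub_agent,
        "sequence": sequence,
        "uptime_ms": uptime_ms,
        "num_samples": num_samples,
    }


# Hand-built example header: agent 192.0.2.1, sequence 7, no samples.
example = struct.pack("!II4sIIII", 5, 1, bytes([192, 0, 2, 1]), 0, 7, 123456, 0)
header = parse_datagram_header(example)
print(header)
```

A real collector would read these bytes from a UDP socket and then walk the sample records that follow the header.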

 

Brocade sFlow Implementation

Brocade was the first networking vendor to recognize the power of sFlow and has implemented it on a broad range of platforms, from the latest ICX range of switches to the high-end MLXe terabit router, delivering packet sampling on interfaces from 10Mbps to 100Gbps. In all cases the implementation is in hardware, and on chassis-based systems such as the MLXe, sFlow is embedded in the line cards, thus guaranteeing the performance and scalability of the solution. The sFlow agent within a Brocade switch or router is configured via the CLI.

 

sFlow configuration, collection and analysis functions have been integrated into Brocade Network Advisor (BNA), which can monitor traffic in real time to quickly identify problems in the network. Customizable predefined reports are available that draw on the stored sFlow data, and users can also create their own reports from a selection of templates. BNA can also export the collected data in a number of formats for analysis on external systems. For more details please refer to the section Collecting and Analyzing sFlow Data below.

 

 

sFlow Deployment

The sFlow capabilities embedded within every Brocade switch and router are ideally placed for continuous data collection across the network and negate the need for external probes. The result is a system that delivers a comprehensive and flexible solution which imposes minimal overhead on the network yet allows every connection to be monitored to deliver a highly accurate and detailed picture of network traffic and its profile.

 

Enabling and configuring sFlow on Brocade FastIron switches is a simple operation that is typically completed via the switch CLI. A small number of tasks are mandatory, and some additional tasks are optional depending on the level of customization required.

 

Mandatory Tasks

1. Specify collector information.

Up to four collectors can be specified. To specify an sFlow collector on an IPv4 device with IP address <IP Address>, enter the following command. The standard port for sending sFlow datagrams is UDP port 6343.

     Brocade(config)#sflow destination <IP Address>

 

To specify an IPv6 address for the sFlow collector, enter the following command.

     Brocade(config)#sflow destination ipv6 <IP Address>

 

2. Enable sFlow globally.

To globally enable sFlow forwarding within a device, enter the following command:

     Brocade(config)#sflow enable

 

3. Enable sFlow forwarding on individual interfaces.

To enable sFlow forwarding for specific interfaces, for example Ethernet ports 1/1 through 1/8, enter the following:

     Brocade(config)#interface ethernet 1/1 to 1/8

     Brocade(config-mif-1/1-1/8)#sflow forwarding

 

4. Enable sFlow forwarding on individual trunk ports.

To add sFlow forwarding on trunk port Ethernet 4/2, enter the following:

     Brocade(config)#trunk e 4/1 to 4/8

     Brocade(config-trunk-4/1-4/8)#config-trunk-ind

     Brocade(config-trunk-4/1-4/8)#sflow forwarding e 4/2
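When the same settings have to be applied to many switches, the command lines from steps 1 to 3 can be generated rather than typed. The helper below is a hypothetical sketch: it only renders the commands already shown above, it does not connect to a device, and the collector addresses are documentation-range examples.

```python
import ipaddress


def sflow_commands(collectors, first_port, last_port):
    """Build the mandatory sFlow CLI lines shown in steps 1 to 3."""
    if not 1 <= len(collectors) <= 4:
        raise ValueError("between one and four collectors can be specified")
    lines = []
    for collector in collectors:
        address = ipaddress.ip_address(collector)
        # IPv6 collectors use the "ipv6" keyword, IPv4 collectors do not.
        keyword = "ipv6 " if address.version == 6 else ""
        lines.append(f"sflow destination {keyword}{address}")
    lines.append("sflow enable")
    lines.append(f"interface ethernet {first_port} to {last_port}")
    lines.append("sflow forwarding")
    return lines


commands = sflow_commands(["192.0.2.10", "2001:db8::10"], "1/1", "1/8")
print("\n".join(commands))
```

The output is a list of plain command strings that can be pasted into a configuration session or fed to whatever provisioning tool is in use.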

 

Note 1: Although it is possible to implement sFlow on selected interfaces and trunk ports, Brocade recommends enabling it globally, as the benefits greatly outweigh the negligible system and network impact.

 

Note 2: sFlow data must be exported from a FastIron switch via a network port; it cannot be passed out of a switch via the management port, i.e., the sFlow collector cannot be connected to the management interface.

 

 

Optional Tasks

The following are optional tasks when configuring sFlow.

 

Changing the Polling Interval

The polling interval defines how often interface counter data for a port is sent to the sFlow collector. If multiple ports are enabled for sFlow, the Brocade device will stagger transmission of the counter data to smooth performance. For example, if sFlow is enabled on two ports and the polling interval is twenty seconds, the Brocade device will send counter data every ten seconds. The counter data for one of the ports are sent after ten seconds, and counter data for the other port are sent after an additional ten seconds. Similarly, if sFlow is enabled on five ports and the polling interval is 20 seconds, the Brocade device sends counter data every four seconds.

The default polling interval is 20 seconds, and it can be set to any value of 1 second or higher. The interval value applies to all interfaces on which sFlow is enabled. If the polling interval is set to 0, counter data sampling is disabled.
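The staggering arithmetic in the examples above can be written out as follows. This sketch assumes the device spaces the ports evenly across one polling interval, which is what the two worked examples suggest; the function names are hypothetical.

```python
def counter_export_gap(polling_interval, enabled_ports):
    """Seconds between successive counter exports when the enabled ports
    are spread evenly across one polling interval."""
    if polling_interval == 0:
        return None              # an interval of 0 disables counter export
    return polling_interval / enabled_ports


def export_offsets(polling_interval, enabled_ports):
    """Time, in seconds from the start of the interval, at which each
    port's counter data is sent."""
    gap = counter_export_gap(polling_interval, enabled_ports)
    if gap is None:
        return []
    return [gap * (index + 1) for index in range(enabled_ports)]


print(counter_export_gap(20, 2))   # two ports, 20-second interval
print(counter_export_gap(20, 5))   # five ports, 20-second interval
print(export_offsets(20, 2))
```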

 

Changing the Sampling Rate

The sampling rate is the average ratio of the number of packets incoming on an sFlow-enabled port to the number of flow samples taken from those packets. The default (global) sampling rate can be changed, and it can also be modified on an individual port. The default sampling rate depends on the speed of the interface:

 

Interface Speed       Sampling Rate

10Mbps                256

100Mbps               512

1Gbps                 1,024

10Gbps                2,048

100Gbps               8,192

   Table 1: Default Sampling Rates

 

 

Configuration Considerations

The sampling rate is a fraction in the form 1/N, meaning that, on average, one out of every N packets will be sampled. The sFlow sample command at the global level or port level specifies N, the denominator of the fraction. Thus a higher number for the denominator means a lower sampling rate since fewer packets are sampled. Likewise, a lower number for the denominator means a higher sampling rate because more packets are sampled. For example, changing the denominator from 512 to 128 increases the sampling rate because four times as many packets will be sampled.
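The relationship between the denominator N and the number of packets sampled can be checked with a few lines of arithmetic. This is a minimal sketch; the function names and the packet count are hypothetical.

```python
from fractions import Fraction


def sampling_fraction(n):
    """Fraction of packets sampled, on average, for a denominator of n."""
    return Fraction(1, n)


def expected_samples(packets, n):
    """Average number of samples taken from `packets` packets."""
    return packets * sampling_fraction(n)


# Changing the denominator from 512 to 128 samples four times as many packets.
ratio = sampling_fraction(128) / sampling_fraction(512)
print(ratio)   # prints 4

# One million packets on a port with the denominator set to 1,024.
print(float(expected_samples(1_000_000, 1024)))
```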

Detailed Configuration Information

For full details of the sFlow configuration options please refer to the FastIron Configuration Guide in the references below.

 

References

 

Deployment Configurations for Special Applications

 

General Network Monitoring

 

Real-time monitoring of changes in traffic patterns, congestion and packet loss is an essential part of managing any network and is the only way to escape the break-fix cycle that is so often the cause of unnecessary outages. By highlighting problem areas, the health of the network can be measured and remedial actions can be taken proactively, ensuring that the applications and users that rely on it enjoy the best possible Quality of Experience.

The flexibility of integrated wired and wireless Ethernet networks, together with application virtualization and end-user mobility, means that traffic cannot be measured at a single point. For best results all ports within a network should be monitored. Continuous monitoring across all points in the network means that no incident will go undetected and reliable traffic data will always be available.

As highlighted previously the overhead that sFlow imposes is typically less than 0.02% even at aggressive sampling rates so the availability of network resources to support its deployment should never be an issue.

 

Billing and Accounting

 

Detailed network usage information is needed to fairly charge for network services and to recover the costs of providing value-added services. sFlow data can be used to account and bill for network usage at varying degrees of granularity and can also be used to provide customers with an itemized breakdown of their total traffic, highlighting top users and applications. This information gives the customer confidence in the fairness of the charges and allows them to identify areas of improvement and control costs.

In most billing systems the objective is to determine from all the packets crossing the network during the fixed billing period how many packets originated from a particular source. Because sFlow is a sampling technology it is important to optimize the sample rate in order to meet the desired level of accuracy and this is done by varying the number of samples collected. The accuracy is a function of the number of samples collected and the duration of the monitoring period. The graph below highlights the error reduction as the number of samples increases.

 

   Figure: Relative Sampling Error

 

Further improvements in accuracy can be made by calculating the “confidence level”, which is the width of the error window. To ensure that customers are not overcharged, the data point at the lower end of the confidence level should be used. The width of the window is a direct function of the number of samples collected and can be calculated as follows:

            Percentage Error ≤ 196 × sqrt(1/c)
                where c is the number of samples.

For example, if the number of samples is 10,000 then the error will be approximately ±2% so the amount charged for should be 98% of the total number of packets counted.

It is important to note that the accuracy of measurement does not depend on the total number of frames, but simply on the number of samples used to make the measurement. This property is very useful when monitoring high-speed switches or routers as the sampling rate, the percentage of the total number of frames sampled, can be reduced as the speed increases. This is reflected in the default sampling rates used by Brocade devices (see Table 1 in the deployment section above). Furthermore, accuracy can be improved by simply increasing the duration of the sample period until the required number of samples has been collected.
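The error bound and its consequences can be worked through in a few lines. The sketch below assumes the bound is 196 × sqrt(1/c), which reproduces the 10,000-sample example above; the constant 196 is 1.96 standard deviations expressed as a percentage, the usual 95% confidence bound. The function names and the packet rate are hypothetical.

```python
import math


def percent_error(samples):
    """Upper bound on the percentage error for a given number of samples."""
    return 196 * math.sqrt(1 / samples)


def samples_for_error(target_percent):
    """Smallest number of samples that keeps the error at or below the target."""
    return math.ceil((196 / target_percent) ** 2)


def seconds_to_collect(samples, packets_per_second, n):
    """Monitoring time needed to gather `samples` samples at 1-in-n sampling."""
    return samples * n / packets_per_second


print(round(percent_error(10_000), 2))      # the worked example: about 2%
needed = samples_for_error(1.0)             # samples needed for a 1% bound
print(needed)
# Hypothetical link carrying 100,000 packets per second, sampled 1 in 2,048.
print(seconds_to_collect(needed, 100_000, 2048))
```

Because the bound depends only on the sample count, a faster link at the same sampling rate simply reaches the required number of samples sooner.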

Because the accuracy of the sFlow measurement depends only on the number of samples, increasing the accuracy is a relatively simple task, and any errors due to lost samples will be negligible. For more detail please refer to the Packet Sampling Basics white paper at sflow.org.

References

 

Security Applications

A significant proportion of network attacks are initiated by “insiders”, employees or others that have an otherwise legitimate reason to be using a company’s network, while external attacks from sources such as the internet are also a constant threat. A comprehensive security strategy involves protecting the network from external and internal misuse and information assets from theft.

Since attacks and security threats will come from unknown sources, effective security monitoring requires complete network surveillance, with alerts to suspicious activity. sFlow provides the required visibility and audit trail for the whole network. The continuous network-wide surveillance and route tracing information provided by sFlow allows internal and externally sourced security threats and attacks to be rapidly traced and controlled.

By giving visibility into real-time and historical network-wide usage, sFlow can be used to prevent intentional attacks, minimize unintentional mistakes, and protect information assets.

There are four main network security threats:

Reconnaissance:  Probing or mapping the network to identify targets. Examples are ping sweeps and port sweeps, both of which are usually a precursor to an actual exploit attempt.

Denial of Service (DoS):  Attempts to consume bandwidth or computing resources in order to prevent normal operation. A Distributed DoS (DDoS) attack is very similar, except that the attack appears to originate from multiple sources.

Exploits:  Attempts to gain access to, or compromise, systems on the network. Often seen as repeated login failures or TCP hijacking.

Misuse:  Attempts to violate organizational policy, for example using disallowed services or including unauthorized content in e-mail or ftp transfers.

In order for attacks to be detected and responded to promptly, an intrusion detection system should meet certain requirements:

  • Network-wide, continuous surveillance
  • Reliable, timely availability of data, especially during network overload, which is common during an attack
  • Interpretation of traffic patterns
  • Alerts to violations or threats

There are some basic techniques for detecting and diagnosing security threats that can be applied by looking at traffic patterns:

  • Look for the Top N hosts associated with suspicious traffic
  • Look for changes in traffic patterns, for example use of new services or new users of services
  • Use of historical traffic patterns to explore the extent of a threat
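The first two techniques above can be sketched against a list of decoded flow samples. Everything here is an assumption made for illustration: the sample tuples, the addresses, the port numbers and the function names are hypothetical, and byte totals are estimated by multiplying sampled bytes by the sampling denominator N.

```python
from collections import Counter


def top_talkers(samples, sampling_n, top_n=3):
    """Rank source addresses by estimated bytes (sampled bytes x N)."""
    totals = Counter()
    for source, frame_bytes in samples:
        totals[source] += frame_bytes * sampling_n
    return totals.most_common(top_n)


def new_services(baseline_ports, current_ports):
    """Destination ports seen now that were absent from the baseline."""
    return sorted(set(current_ports) - set(baseline_ports))


# Hypothetical samples: (source address, sampled frame length in bytes).
flow_samples = [
    ("192.0.2.5", 1500),
    ("192.0.2.7", 64),
    ("192.0.2.5", 1500),
    ("192.0.2.9", 512),
    ("192.0.2.7", 64),
]
ranking = top_talkers(flow_samples, sampling_n=512, top_n=2)
print(ranking)
print(new_services([80, 443], [80, 443, 6667]))
```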

Brocade Network Advisor uses Snort® to detect exceptional events and trigger remedial actions. Snort is an open source network intrusion detection and prevention system (IDS/IPS) that monitors network traffic in real time. On detecting dangerous payloads or other abnormal behavior, it sends an alert to the syslog, which can then trigger remedial actions such as network changes (e.g. disabling a port) or external alerts (e.g. emails or text messages).

Snort uses the Pcap data format which is described in the Exporting sFlow Data section below.

More details on the analysis of sFlow data in security applications are available in the InMon security paper in the References section below.

 

References

 

Collecting and Analyzing sFlow Data

The network's ability to collect and forward data for analysis is only part of the solution; the real value is delivered when that data is collated and turned into meaningful reports for deeper analysis. As stated previously, Brocade Network Advisor can act as an sFlow data collector and analyzer, offering standardized reports that include the following:

  • Layer 2 reports: MAC addresses and VLANs 
  • Layer 3 or Layer 4 traffic reports: All Layer 3 protocols, IP, IPv4, and IPv6
  • IPv4 and IPv6 reports provide the following additional reports:
    • Top users of all Layer 4 protocols
    • Top TCP Talkers
    • Top ICMP Talkers
    • Top UDP Talkers
    • Top talkers for all services
    • Top talkers for all Layer 4 protocol services excluding TCP, UDP, and ICMP
    • TCP reports: Invalid TCP flags and valid TCP flags
    • BGP path reports

In addition to the standard reports BNA includes the ability to create highly customized reports which allow very precise analysis of the collected data with filters that include:

  • Device
    • Inbound port
    • Outbound port
  • Layer 2
    • Source and/or destination MAC addresses
    • Source and/or destination VLAN
    • Inbound and outbound QoS priority
  • Layer 3
    • Protocol
    • Source and destination addresses
    • TOS/DSCP values
  • Layer 4
    • Source and destination ports
  • Routing information
    • Source and destination subnets
    • Local and source AS
    • AS paths
    • Flow label
  • Time duration or span of reports
    • Relative time specifies a duration up to the point of report creation e.g. the last hour or day
    • Absolute time allows a specific time period to be defined
  • Report format
    • Table and/or chart
    • Table and/or chart format and data ranges (e.g. top 10, bottom 10, etc.)
    • Byte or frame counts

Custom reports can be scheduled to run at specific times or intervals so that they are completed as background tasks and do not have to be triggered manually.

 

Displaying sFlow Data Using Brocade Network Advisor (BNA)

Within BNA, dashboard “widgets” can be used to highlight key areas of interest on the main application dashboard so that they are always visible and quickly accessible. BNA includes the following preconfigured status widgets:

  • Access Point Status Widget: Pie chart view of access point devices categorized by operational and reachability status
  • Events Widget: Bar chart view of events grouped by severity and range
  • IP Inventory Widget: Stacked bar chart view of IP devices grouped by operational status and selected category
  • IP Status Widget: Pie chart view of IP devices categorized by operational and reachability status  
  • Status Widget: List view of various status attributes

For larger scale data analysis and viewing of sFlow reports, the BNA Performance Dashboard simplifies the task of customizing performance monitors specific to your needs. Up to 100 customized performance monitors can be defined, and the system can display up to 30 at a time.

 

Exporting sFlow Data

For external analysis the sFlow data can be exported in raw form or converted into the standard Pcap (Packet capture) format, which is understood by a variety of open source products which can then provide additional tools to detect anomalies and defend against network attacks. The Pcap data format is used by the Snort application which is embedded within BNA and can be used to perform intrusion detection and prevention.
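To show what the Pcap conversion involves, the sketch below wraps one sampled packet header in the classic pcap file layout (a 24-byte file header followed by a 16-byte header per record). It is not a description of how BNA performs the export; the function names, the zero timestamp and the all-zero sample bytes are hypothetical. The snap length of 128 mirrors the 128-byte header capture mentioned earlier.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap with microsecond timestamps
LINKTYPE_ETHERNET = 1


def pcap_global_header(snaplen=128):
    """24-byte file header: magic, version 2.4, zone, sigfigs, snaplen, link type."""
    return struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, snaplen,
                       LINKTYPE_ETHERNET)


def pcap_record(ts_sec, ts_usec, header_bytes, original_length):
    """16-byte record header followed by the captured bytes.

    The captured length is the size of the sampled header; the original
    length is the size of the full frame it was taken from."""
    record_header = struct.pack("<IIII", ts_sec, ts_usec,
                                len(header_bytes), original_length)
    return record_header + header_bytes


sampled_header = bytes(64)   # placeholder for 64 bytes of sampled packet header
capture = pcap_global_header() + pcap_record(0, 0, sampled_header, 1500)
print(len(capture))
```

Writing `capture` to a file with a .pcap extension yields a file that standard packet analysis tools can open, although the all-zero placeholder bytes will not decode into a meaningful packet.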

Comments
by jkemery1
on ‎05-22-2013 08:23 AM

Brook,

Great write up on sFlow! For Version 2 consider adding some screenshots of actual sFlow widgets in BNA. Something like a BGP AS-PATH report and interface stats would be helpful.

Thanks,

James

by fhameed
on ‎05-22-2013 04:30 PM

Nice white paper. sFlow is a key differentiator on Brocade switches. Maybe you can add an sFlow vs Netflow comparison table.

by Simon Pollard
on ‎05-23-2013 06:54 AM

Hi James, Great idea. I did look in the BNA config guide but couldn't see any suitable images so will get access to a system and see if I can generate some.

Simon

by Simon Pollard
on ‎05-23-2013 06:57 AM

Hi Faisal, There is an sFlow vs Netflow comparison in the white paper posted on the brocade.com sFlow page.

Simon
