Next week, arguably one of the biggest SDN events will be held in Denver, Colorado. At this event there will be more live demonstrations of SDN applications and solutions by organizations actually using SDN than there have been at any other event to date. A vibrant community will come together for nearly a week to openly share technology and ideas with one another for the betterment of
each organization... scratch that, mankind. If you have an interest in where the world of networking is going, then you should pay attention to what is happening at Supercomputing 2013 (SC13). For the last 25 years, the Supercomputing Conference has been the premier event for the High-Performance Computing community, attracting members of research and education networks, universities, national laboratories, and other public and private research institutions. It is at this event, and really this event only, that you can get a preview of what networking may look like 2, 3, maybe even 10 years down the road.
These are exciting days in my Forwarding Abstractions Working Group (FAWG); the past 15 months of work is just now producing results that will be instrumental in enabling our SDN future. Past is meeting future in my extended present, and each week brings us closer to fruition. Some weeks ago, FAWG produced a “preliminary final” document about our framework enhancements for the OpenFlow ecosystem.
The true value of SDN will come from the real-world applications that it will enable. At times, when listening to vendors describe their SDN strategies, it seems as though we are hearing about solutions in search of a problem rather than the other way around. Today I am happy to be part of an announcement that is not that.
For service providers, the onslaught of data on their networks has made them re-examine, and re-examine again, how to build a more efficient network. The goal is a dynamic network infrastructure at all layers that not only handles the massive amount of traffic, but also unlocks new revenue streams by providing premium services with stronger SLAs to customers.
Today, Brocade is excited to be part of an announcement of a successful demonstration of such an SDN application. In collaboration with Infinera and ESnet, this demonstration shows how SDN can be used to provision services and automatically optimize network resources across a multi-layer network as traffic and service demands change. By leveraging an application developed by ESnet, called OSCARS, we were able to show two use cases:
Originally a research project by ESnet, OSCARS has helped scientists collaborate with one another from around the world by moving massive amounts (in petabytes) of mission-critical data generated from research and experiments. Now with the demonstration announced today, this provisioning can be done at both the routing and transport layer using SDN. You can see the details of the demo setup in the diagram below.
This demonstration also exemplifies the power of an open SDN ecosystem. One of the major tenets of SDN is to unlock new levels of innovation for network operators. Establishing APIs between layers, software components, and best-of-breed network elements can ignite innovation, enabling the development of customized applications.
This week the demo will be on display at SDN & OpenFlow World Congress in Bad Homburg, Germany. If you are there, please stop by the Infinera booth #39 to see it live. Also check out the Brocade booth #41 where you can learn more about this solution in addition to other Brocade SDN and OpenFlow solutions. If you aren’t lucky enough to be in Germany, we will be showcasing the demo during a DemoFriday hosted by SDNCentral at 9am on November 22, 2013. Stay tuned for more details on this live virtual event where you will have a chance to hear commentary from Brocade, Infinera, and ESnet, see a live demonstration, and have time for questions and answers.
Last October the world’s largest Telco Service Providers collectively published the now-famous paper titled, “Network Functions Virtualisation,” which is serving as a call-to-action for the industry. Less than a year later, Brocade announces our next-generation Vyatta vRouter, the 5600.
The industry-leading Vyatta 5400 vRouter has been in production for years and is deployed worldwide. It’s a fantastic solution for multitenant workloads and is deployed in some of the largest clouds, including Amazon, Rackspace and SoftLayer. But the new Telco-driven NFV demand is different. It requires a new level of performance from a virtual router.
This is a critical business issue. The NFV movement is in pursuit of a tremendous boost in network agility in order to enable Telcos to stay competitive. They need an order of magnitude improvement in their time-to-market and adaptation of service offerings to rapidly changing demand dynamics. They also need to get their infrastructure down to an entirely new cost model.
To get there, Telcos will begin deploying substantial parts of their network infrastructure on industry standard x86 servers. If you haven’t looked closely lately, the servers that Telcos will be using are the most network-centric the world has ever seen. The NFV business case assumes that software can take advantage of this modern hardware.
This is the business value of the new Brocade Vyatta 5600 vRouter. Re-architected specifically to leverage Intel’s latest and greatest, the 5600 is the world’s first virtual router for NFV workloads. With speeds that are a full 10x faster than our popular 5400 model, the 5600 can unleash the power of incredibly cost-effective servers and deliver customers a solution that enables them to meet their strategic goals of radically higher agility and lower cost.
It’s not a coincidence that the 5600 is following closely on the heels of the NFV movement; it is proof that Brocade is listening to customers and is aggressively delivering solutions to meet their rapidly changing needs. As Vyatta we invented the virtual router category; as Brocade we're rapidly taking it to new heights.
Today, Brocade unveiled its newest product targeting the Network Functions Virtualization (NFV) movement with the introduction of the Brocade Vyatta vRouter 5600 family. Featuring advanced routing capabilities including BGP and OSPF, the Brocade Vyatta vRouter 5600 is the world’s fastest virtual router with performance up to 10 times that of the industry-leading Brocade Vyatta vRouter 5400 series and is more than 40 times faster than competing products.
With the unique ability to deliver hardware-like performance as a software appliance, the Brocade Vyatta vRouter 5600 enables telecommunication and large service providers to significantly reduce capital (CapEx) and operating (OpEx) costs throughout key areas of the network, without a reduction in performance.
At the foundation of the Brocade Vyatta vRouter 5600 is the company’s vPlane™ technology, a highly-scalable forwarding plane capable of delivering more than 14 million packets per second per x86 core, the equivalent of 10 Gb/s throughput.
A core component of the “On-Demand Data Center™” strategy from Brocade and a key element of SDN architectures, NFV is a call-to-action from customers to convert and consolidate services traditionally delivered through proprietary, purpose-built hardware into virtual machines (VMs) running on industry standard high-performance servers. First adopted by Cloud Service Providers, including some of the largest such as Amazon, Rackspace and SoftLayer, the Vyatta vRouter is the most widely used NFV element in the world.
The Brocade Vyatta vRouter 5600 is currently in limited availability and open to qualified organizations. General availability is scheduled for the end of 2013.
Based on the 250 Mbps performance claim listed on the Cisco Cloud Services Router (CSR) 1000V data sheet as of September 2013
When we founded Vyatta in 2006 the idea of using an x86 server as network infrastructure was pure folly to just about everyone. I’ll never forget the ridicule of one naysayer who in 2008 was quoted in the press as saying, “I would NEVER put a PC in my network.”
As it turns out, the world’s largest semiconductor vendor had plans that guy didn’t know about. Big, non-obvious plans executed quietly over the years to avoid attracting too much attention… a silicon crocodile easing itself toward the beach…
Intel played from their strength as the industry’s dominant processing platform, slowly absorbing networking as a native workload into that silicon architecture. With each turn of their x86 foundry, servers became better packet-processing machines. Those servers kept shipping in mass volumes, and in a short period of time Intel had blanketed the world with Network-Centric Servers.
[Table: Intel x86 server generations by approximate release date, packet-processing performance per generation (1)(2), and the percentage of servers shipping with 10 Gb/s NICs (3)]
Putting this throughput into context, 14.4 million 64-byte packets per second is essentially line-rate 10 Gb/s performance (the theoretical maximum is about 14.88 Mpps). So the faster packet-processing innards quickly red-lined the basic 1 Gb/s NIC, making it the new bottleneck. By next year it’s estimated that the average server will be shipping with 10 Gb/s NICs.
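For readers who want to check the arithmetic, the line-rate figure falls straight out of Ethernet framing overhead. Here is a minimal sketch (the function name and values are purely illustrative, not from any Brocade or Intel tooling):

```python
def line_rate_pps(link_bps, frame_bytes):
    """Maximum packets per second on an Ethernet link for a given frame size.

    Every frame on the wire carries 20 extra bytes beyond the frame itself:
    7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
    """
    bits_per_frame = (frame_bytes + 20) * 8
    return link_bps / bits_per_frame

# 64-byte frames on a 10 Gb/s link:
print(line_rate_pps(10e9, 64))  # ~14.88 million packets per second
```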
And there you have it: The Network-Centric Server. They’re here, they’re incredibly low-cost compared to proprietary network boxes and customers have already begun to leverage them aggressively.
The adoption started becoming obvious a couple of years ago and now it’s hit deafening levels. Tier 1 Cloud Providers - such as Rackspace, SoftLayer and Amazon - are delivering Network-as-a-Service powered by software infrastructure on servers. All of the world’s top Telcos have formally stated their demand for NFV, which is in essence networking software running on servers. Enterprises are also starting to make the shift. And just in case you missed it, last week at VMworld, VMware finally made it clear they’re reaching for Cisco’s artificially inflated wallet.
The Network-Centric Server is about more than just slashing 90% of the CapEx out of the system. From the strategic / macro perspective, it ushers in a decomposition of system architectures in the same way that happened to compute in the 1990s. That technology disruption shattered and reconstructed an industry three times larger than the network industry. Now it’s our turn.
The crocodile’s now on the beach… and it will not be denied its meal.
1 Approximate, based on figures from Intel and Vyatta testing
2 Not yet disclosed, but hold onto your hat
3 Source: Dell’Oro
I saw an interesting article today about Internet2 that I wanted to share: http://www.healthcare-informatics.com/article/conn
Research institutes are, by design, on the cutting edge of many fronts, and data generation is no exception. Researchers are producing data in petabyte-levels today, and that number is growing exponentially. While this is great news in many ways (more data paves the way to more results), it’s creating challenges when it comes to actually transporting this data and collaborating with other research institutes. Existing Research and Education Networks (RENs) are not equipped to deal with today’s data requirements, meaning that researchers are forced to discard over 90% of data generated(!).
That’s an unacceptable number, so fortunately RENs are taking action to address these trends. Internet2, an advanced networking consortium for the research and education community, is leading the charge. In the article, Robert Vietzke of Internet2 shares three key points that he sees as necessary for supporting high-data research: 100 GbE, security and specifically addressing big data flows, and SDN. We at Brocade completely agree, which is why we partnered with Internet2 to build out their 100GbE SDN backbone using high-performance Brocade MLXe routers. The MLXe enables zettabyte levels of data transport and increased performance through full wire speed 100GbE and leading 10 GbE, 40 GbE, and 100 GbE density. And with hardware-enabled Hybrid Port OpenFlow, RENs can run OpenFlow at full wire speed up to 100 GbE on the same ports as traditional traffic. That means implementing OpenFlow without affecting performance.
The research space is a fascinating one. Pushing the boundaries to drive innovation is something I definitely support. Keep an eye out for more!
There has been a great deal of discussion of late of the relative merits of overlay approaches to SDN vs methods in which network hardware has a more active role to play. The arguments go back a year or more, and are fundamentally rooted in varying assumptions about the primacy of the hypervisor in modern data center architectures.
First, a bit of history. In August, 2012, several Nicira founders published a paper exploring the role of a physical network fabric in an SDN architecture. The paper observed that OpenFlow, and more broadly SDN, in its then-current instantiation didn’t actually solve some fundamental networking problems: most notably, it didn’t do anything to make network hardware any simpler, nor remove the dependency of the host on behavior in the network core. The proposed solution was effectively ‘smart edge/fast, dumb core’, though there were also two key observations that blunt the problems with that oversimplification.
However, as the term “network virtualization” entered the discussion, the temptation quickly became to discuss overlays as though they are completely analogous to server virtualization. Joe Onisick did an excellent job of unwinding that analogy, and I’d encourage you to read his post in its entirety, as well as the comments. His key point was this:
The act of layering virtual networks over existing infrastructure puts an opaque barrier between the virtual workloads and the operation of the underlying infrastructure. This brings on issues with performance, quality of service (QoS) and network troubleshooting…this limitation is not seen with compute hypervisors which are tightly coupled with the hardware, maintaining visibility at both levels.
In other words, network virtualization *still* does not address the fundamental problem of overly complex, rigid and manual physical infrastructure, nor prevent interdependent (physical-virtual) failure. In fact, it adds complexity in the form of additional layers of networks to manage, even while it simplifies the configuration and deployment of specific network services to specific clients. (In the absence of hardware termination, it is also unavailable as a solution to a broad swath of non-virtualized workloads, a problem which VMware is moving swiftly to address.)
Where does this leave us, then? The earlier articulation of the distinct purposes of the core and edge is important. True fabrics such as VCS explicitly address the need for simpler physical network operations through automation of common routines, as well as a more resilient, low latency and highly available architecture: a fast, simple, highly efficient forwarding mechanism. However, precisely because the physical network needs to be able to operate and evolve independently of what goes on at the software-defined level, fabrics cannot be “dumb”. The individual nodes must in fact have sufficient local intelligence as well as environmental awareness to make forwarding decisions both efficiently and automatically. Not all fabrics are architected thus, but VCS fabrics have a shared control plane as well as unusual multipathing capabilities that allow them to function largely independently after initial set-up. There can also be utility in horizontal, fabric-native services that may be different from those deployed at the edge, or which may in some use cases be simpler to deploy natively.
VCS fabrics also maintain visibility to VMs of any flavor, wherever they may reside or move to, as well as mechanisms for maintaining awareness of overlay traffic, restoring the loss of visibility highlighted by Onisick. In addition, the VCS Logical Chassis management construct provides a much simpler means of scaling the host-network interface. Although VCS fabrics are actually masterless, the logical centralization of management allows the Logical Chassis view to serve as a physical network peer to the SDN controller, while providing the SDN controller a means of scaling across many fabric domains (each domain appears as a single switch), vs a plethora of interactions with each individual node. (NB I'm highlighting some of the specifics of VCS fabrics for the sake of concrete illustration, but broadly speaking, similar principles apply to other fabrics.)
Where many disagree with the Nicira stance is in the claim that an ideal network design would involve hardware that is cheap and simple to operate, and “vendor-neutral”, e.g. easily replaced. I would argue that what matters in terms of network portability is not that hardware needs to be indistinguishable from one vendor to the next; rather, it needs to be able to present vendor neutrality at a policy level. Hardware performance and manageability absolutely continue to matter and remain primary purchasing criteria, assuming equivalent support for higher-level policy abstraction.
Or as Brad Hedlund observed over the weekend:
The ONF recently announced the formation of a Chipmakers Advisory Board (CAB) to help ONF leadership “grok” the chipmakers’ world view such that the ONF can more effectively encourage OpenFlow support in new networking silicon. After a call for applications, the ONF selected thirteen individuals from thirteen member firms that design their own chips. The firms include merchant silicon vendors, network processor vendors and system vendors that use internal ASICs for their systems. As an ASIC designer, I am both honored and excited to be able to represent Brocade on the CAB. The honor seems self-evident; the CAB is a remarkable group of people gathered to positively influence the organization at the core of the biggest disruption in networking since the Internet. The excitement comes because the formation of the CAB highlights the ONF’s readiness to aggressively pursue broad adoption of OpenFlow.
Here are some of the leading challenges that I see the ONF and the CAB working to address in the coming weeks.
Aligning the Hardware / Software Camps
I’ve written before about the tension between the software-centric and hardware-centric camps in the OpenFlow community. Not surprisingly, the dominant organization in the Software Defined Networking advocacy group (the ONF) has a heavy presence of software folks. The CAB, on the other hand, is definitely hardware dominant. The ONF, by creating the CAB, is aggressively seeking to bridge the hardware / software understanding gap, which is key to enabling a robust OpenFlow ecosystem.
The customers that I’ve spoken to have a strong interest in products that are OpenFlow-capable, but which also support a rich collection of legacy protocols. Such hybrid devices are essential to providing a realistic transition picture for SDN deployment. Without hybrid, customers would face high stakes binary architecture / purchase decisions at a time when applications are immature. Without a transition story, few buyers can justify the risk, and the market can stall before it really gets started. Without a healthy early market, it’s tricky to develop the apps, standards, interoperability and various other elements that characterize a robust market. A chip that supports hybrid boxes provides vendors with win-win: access to the small-but-growing OpenFlow market while also serving the larger established legacy market. OpenFlow-optimized silicon will be much easier to invest in down the road.
Today’s networking silicon is diverse for a variety of reasons. OpenFlow Switch 1.0 was simple and supportable across a wide variety of chips because it depended only on basic common features present in most platforms. But newer versions of OpenFlow go beyond those most common features. Many newer OpenFlow features are optional. That broadens the range of what OpenFlow can support, but it also makes implementations more diverse and makes it trickier for app architects to know which features they can count on. This “interoperability challenge” was highlighted to some degree at the most recent ONF plugfest at Indiana University. This plugfest included multiple products with preliminary support for OF-Switch 1.3, and the diversity of implementation has been a topic of discussion. (The Testing and Interoperability Working Group (TestWG) is working up a technical paper, as they have for previous plug-fests.) The Forwarding Abstractions Working Group (FAWG, of which I’m the chair) is working to mitigate challenges related to platform diversity, though some interoperability challenges, such as management, are outside the scope of FAWG.
Configuration and Management
The Configuration and Management Working Group (CMWG) defines the OF-Config protocol, including OF-Config 1.0 and 1.1 (OF-Config 1.2 is in process at this time), but the adoption of OF-Config is behind that of OF-Switch. Compounding matters, the leading virtual switch, Open vSwitch, uses OVSDB as its management protocol. The ONF has not yet specified a management protocol as part of the OpenFlow Conformance Program, so a conformant product could use OF-Config, OVSDB, or some other management protocol.
The ONF established some aggressive milestone dates for the CAB, and the members are attacking the deliverables with gusto. The CAB’s first opportunity to “synch up” with other ONF leadership comes on August 7, when the CAB, the Council of Chairs (CoC) and the Technical Advisory Group (TAG) will meet for the first such face-to-face conversation. (The CoC and the TAG have already had several face-to-face meetings; they are very helpful.)
Today’s complex chips require 12 to 18 months to develop; revolutionary architectures are more toward the long end of that spectrum. Add the time required to productize, test, train, etc, and we’re looking at 30 months from project start until real revenue begins. With tens of millions of dollars required to get a new chip to market, firms require confidence that the spec is stable and that demand two to four years out will be big. On the other hand, hybrid boxes based on existing or evolutionary chips cost less, are less risky, and provide a compelling transition story. The CAB will be working to explain how its members view all these topics. I’m confident that the resulting alignment will accelerate the delivery of compelling solutions.
Image credit: File:Cabs.jpg - Wikimedia Commons
In my previous blog, I briefly discussed the Network Functions Virtualization (NFV) movement and the reasons why traditional Service Providers (SPs) are adopting NFV. One of the major qualifications for NFV will be to achieve relatively higher virtualized performance and scalability similar to what is now considered the norm in the hardware paradigm. Among the various trends that will define I/O virtualization, the two most distinct ones for consideration in NFV are:
In the virtualized environment, Layer 2 (L2) performance and throughput match line rate even at 10G, thanks to the L2 switching performance improvements in virtual switches. With NFV, and in this blog, the performance discussion refers to Layer 3 (L3) services - firewall, routing, load balancers, etc. The quest is to find how higher packet throughput can be achieved when a combination of such services is configured on the virtual networking device, i.e. a virtual machine that supports services such as firewall, routing, and load balancing, to name a few.
The performance degradation for L3 services in the virtual environment can be attributed to the components that sit between the Virtual Machine (VM) interface and the server’s hardware Network Interface Card (NIC).
Figure 1: A generic depiction of the layers between the VM interface and physical NIC.
These layers generally consist of the hypervisor’s virtual switches, system level drivers, hypervisor OS drivers, and so on. I/O techniques such as Peripheral Component Interconnect passthrough (PCI passthrough) and Single-Root I/O Virtualization (SR-IOV) are ways to bypass these components and tie the NIC directly to the interfaces of the virtual machine for higher throughput and performance.
Let’s look at SR-IOV as an I/O technique with Intel’s architecture to achieve the target 10G performance for NFV. SR-IOV is defined by a PCI-SIG specification with Intel acting as one of the major contributors to the specification. The main idea is to replicate the resources to provide a separate memory space, interrupts, and DMA (Direct Memory Access) streams per VM, and for each VM to be directly connected to the I/O device so that the main data movement can occur without hypervisor involvement.
In traditional I/O architectures, a single core has to handle all the Ethernet interrupts for all the incoming packets and deliver them to the different virtual machines running on the server. Two interrupts, on two cores, are required: one to service the interrupt on the NIC (incoming packet) and determine which VM the packet belongs to, and a second on the core assigned to the VM, to copy the packet to the VM where it is destined. This results in increased latency as the hypervisor handles every packet destined for the VMs.
Figure 2: SR-IOV yields higher packet throughput and lower latency
To achieve some of the stated benefits in Figure 2, SR-IOV introduces the idea of a Virtual Function (VF). Virtual Functions are ‘lightweight’ PCI function entities that contain the resources necessary for data movement. With Virtual Functions, SR-IOV provides a mechanism by which a single Ethernet port can be configured to appear as multiple separate physical devices, each with its own configuration space. The Virtual Machine Manager (VMM on hypervisor) assigns one or more VFs to a VM by mapping the actual configuration space to the VM’s configuration space.
Figure 3: A physical NIC is carved into multiple VFs that are assigned to guest VMs (Intel).
When a packet comes in, it is placed into a specific VF pool based on the MAC address or VLAN tag. This allows direct DMA transfer of packets to and from the VM, bypassing the hypervisor and the software switch in the VMM. The hypervisor is not involved in processing or moving the packet (between the hardware interface and the actual VM), thus removing any bottlenecks in the path.
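To make the mechanics concrete, here is a minimal sketch of how VFs are typically carved out of a physical NIC on a Linux host using the standard sysfs interface. The interface name and VF count are hypothetical, and an SR-IOV capable NIC, driver support, and root privileges are assumed:

```python
from pathlib import Path

def enable_vfs(pf_iface: str, num_vfs: int) -> None:
    """Carve num_vfs Virtual Functions out of the physical function (PF)
    behind interface pf_iface, using the standard Linux sysfs knob."""
    dev = Path(f"/sys/class/net/{pf_iface}/device")
    supported = int((dev / "sriov_totalvfs").read_text())
    if num_vfs > supported:
        raise ValueError(f"{pf_iface} supports at most {supported} VFs")
    # The kernel rejects changing a non-zero VF count directly, so reset first.
    (dev / "sriov_numvfs").write_text("0")
    (dev / "sriov_numvfs").write_text(str(num_vfs))

# Hypothetical example: split the 10G port behind eth4 into 8 VFs,
# which the hypervisor can then hand to guest VMs as PCI devices.
# enable_vfs("eth4", 8)
```

Once the VFs exist, the hypervisor attaches them to guests as if they were dedicated PCI devices, which is what removes the software switch from the data path.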
SR-IOV is supported on most hypervisors, such as Xen, KVM, Hyper-V 2012, and ESXi 5.1. Newer servers and 10G NICs with SR-IOV support are required (and virtualization is a must for NFV). While SR-IOV is one way to achieve high packet throughput, actual deployment will give rise to discussions regarding hardware support (hypervisor, NICs), traffic separation, etc. Another issue is that even though the physical NIC can be carved into multiple VFs (the number depends on the NIC and the hypervisor), there are practical limitations on how many VMs can be deployed to share this NIC. Techniques for VM mobility, moving from a VF on one server to a VF on another server, have to be explored. The open question is whether customers will require VM mobility or if we can assume that these VMs are immobile and tied to the SR-IOV server on which they are initially deployed. While SR-IOV promises to deliver high packet throughput, the exact nature of these improvements for L3 services will ultimately depend on a vendor’s software networking architecture or vendor-enforced license throttling.
Like NFV, SR-IOV is a new and exciting paradigm shift. In the months to come, there will be lots of activity to define the solutions and use cases that will lend itself to the deployment of virtual networking software. As NFV gathers momentum, it will be interesting to learn what the DevOps environment will look like; perhaps that will be the topic of my next blog.
A couple of questions I typically hear are, “What is SDN?” and “How does OpenFlow work?” To begin down the path of SDN enlightenment, it really helps to think about networking and “traffic” forwarding from a completely different perspective. Most of us have been working with traditional Ethernet networks for a very long time, and our views are typically etched into our minds by that past experience. To answer these questions, it really helps to shift gears a bit and examine a basic network that can be built with OpenFlow. Please note the following examples are very basic and are not intended to showcase an actual solution, but rather to highlight how OpenFlow works.
For the first example, let’s consider some very basic requirements: a research organization needs to map physical Ethernet ports to one another. For example, port 1 is mapped to port 3. All network traffic entering port 1 is forwarded out of port 3.
A simple web front end provides the user the ability to easily see existing mappings and add new mappings.
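A port mapping like this reduces to a single flow entry per direction: match on the ingress port, output to the mapped port. Below is a minimal controller sketch using the open-source Ryu framework; the port numbers and class name are illustrative, and this is just one way to express the rule, not the actual application described above:

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class PortMapper(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]
    PORT_MAP = {1: 3, 3: 1}  # map port 1 to port 3 and back

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def install_mappings(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        for in_port, out_port in self.PORT_MAP.items():
            # Match everything arriving on in_port and send it out out_port.
            match = parser.OFPMatch(in_port=in_port)
            actions = [parser.OFPActionOutput(out_port)]
            inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
            dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                          match=match, instructions=inst))
```

In this sketch, a web front end would simply rewrite PORT_MAP and push the flows again; the real application could, of course, be structured quite differently.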
The second example simply builds on the first by adding in the concept of “global knowledge” or multi-device control. The same research organization needs to map physical Ethernet ports to one another. For example, port 5 on switch 1 is mapped to port 20 on switch 2. All network traffic that enters port 5 on switch 1 is forwarded out of port 20 on switch 2.
For this example, assume the application has been "hard coded" with inter-switch link information. This could easily be determined by the application with a routine that sends LLDP packets out all ports of all switches connected to the controller. A corresponding flow entry matching all LLDP packets, with an action to send them to the controller, would give the application enough information to determine the topology.
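As a sketch of what that discovery rule might look like, using the same Ryu-style API as the earlier example (the priority value is arbitrary and the variable names assume the surrounding handler from that sketch), the entry simply matches the LLDP EtherType and punts matching packets to the controller:

```python
# Inside the same switch-features handler as the earlier sketch:
match = parser.OFPMatch(eth_type=0x88cc)  # LLDP EtherType
actions = [parser.OFPActionOutput(ofp.OFPP_CONTROLLER, ofp.OFPCML_NO_BUFFER)]
inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                              match=match, instructions=inst))
```

The controller can then emit LLDP probes out every port via packet-out messages and learn the inter-switch links from the packet-ins it receives back.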
Let’s look at how SDN in this example differs from classical Ethernet:
Start thinking about how this type of architecture solves traditional networking problems more efficiently. Also notice how abstraction of the control plane could potentially provide greater flexibility AND simplicity when compared to traditional networking protocols. Up next: an emulated network (mininet) that demonstrates implementation of the examples highlighted above.
The prestigious Ivy League institution, the University of Pennsylvania, is currently upgrading its campus core network, and Shumon Huque of UPenn detailed the progress in his blog here. The blog provides an overview of the new network design, the benefits of the new solution, what they are replacing, and future plans for more upgrades and SDN.
I work on the product team at Brocade, so it is pretty cool to hear directly from a customer in a public forum about their network upgrade. It also gives similar organizations considering an upgrade an inside look into what a peer is doing.
Brocade has long been a supporter of higher-education and research networks. Our routing and switching solutions fit perfectly with the mission-critical high-performance computing conducted in research labs, as well as connecting the myriad of devices and users added to campus networks daily. Whether it is Indiana University building the first statewide 100 GbE network for research and education or, in this case, University of Pennsylvania deploying 100 GbE in its campus core, the Brocade MLXe routers have met their network demands for years. With a completely non-blocking architecture and line-rate performance, the Brocade MLXe solution ensures full application performance. Such performance enables universities to share a vast amount of data with devices distributed across the local campus and sites worldwide.
Now, with technologies like SDN, Brocade is bringing even more to the table for these types of networks. As Shumon mentions in his blog, the Brocade MLXe supports SDN with OpenFlow at 100 GbE. Today’s research and education networks are hot testbeds for this new and innovative technology. In fact, Brocade routers power Internet2, the nation’s first 100 GbE open, software-defined network, which UPenn connects into for access to research from labs and institutes around the world. I2 utilizes the programmatic control of OpenFlow for Big Data transfers on their backbone network.
Today, many research networks and universities are following UPenn’s lead: building SDN-ready networks and creating SDN testbeds to experiment. Shumon points out that, ‘SDN is still largely a solution in search of a problem’ for UPenn. It is true that with all the ‘SDN-washing’ going on in the industry, SDN has come to mean a lot of different things to a lot of different people, and at times it might seem more hype than anything else. However, for the research and education community SDN, and specifically OpenFlow, can solve real problems today. For example, the large flows (often called elephant flows) generated by scientific experiments and research consume a disproportionate amount of resources. As these come on to the network, having the ability to virtualize network resources with OpenFlow will improve network utilization and flexibility. The Brocade MLXe supports Hybrid Port Mode for OpenFlow as well, which makes it even easier to deploy OpenFlow on existing network infrastructure and use it as needed while the rest of the traffic is routed as normal.
I look forward to hearing more from UPenn on their upgrade to 100 GbE on the MLXe, and to hearing what kinds of SDN use cases they test.
Network Functions Virtualization (NFV) is a call to action for traditional Service Providers (carrier Telcos) to adopt software virtualization and general purpose hardware, especially for the networking pieces. The ideas behind NFV are not new. What is new with NFV is the move away from proprietary hardware in some segments of their networks.
Figure 1: Motivations for NFV
Virtualization is not a new concept; today Gartner estimates that more than 50% of the x86 server install base is virtualized. This trend is not limited to application virtualization: virtualized networking capabilities can also be deployed on non-proprietary server hardware and this is the crux of the NFV movement.
Virtualization provides the benefit of being able to host multiple applications on a single server, providing significant cost savings. However, for NFV the main impetus may not be cost reduction, although that is a major benefit of this approach. As an acquaintance inside one such large SP told me, SPs have millions to spend on traditional, proprietary gear. The reason for adopting NFV, therefore, is more the speed with which services can be deployed, thanks to the flexibility and agility of virtualization.
Traditional SPs are now looking at Amazon, Rackspace, and other Cloud Service Providers (CSP) and are working to adopt the CSP model without giving up their requirements for security, performance, and scale. The CSP model is a highly automated, highly orchestrated, virtualized system for deployment of Virtual Private Clouds. What makes the CSP model so attractive is its ability to provide services in innovative ways - and fast. CSPs can provide multi-tenancy in their Cloud Model with inbuilt security using VPN and firewalling. Most importantly, they are also providing Big Data tools to their customers for data analytics. This is not surprising, but indeed a smart move since they already collect tons of data. Their infrastructure gives them this ability to do data analytics natively and not as an afterthought which is characteristic of most traditional networks. For these reasons, the NFV movement is a validation of the CSP model.
Technical challenges for moving to NFV may pose a few obstacles. Dedicated and newer hardware is required to achieve faster speeds using SR-IOV (or PCI passthrough). The need for dedicated servers stems from the fact that the NIC is tied to the interfaces of the virtual machine for higher throughput and performance (no virtual switches in between, and therefore higher throughput). For maximum performance, newer servers with newer NICs that support SR-IOV are required. In addition to performance, higher Service Level Agreements (SLAs), stringent Quality of Service (QoS) requirements, and High Availability (HA) will be the focus of the overall NFV effort.
An additional point to support NFV requirements is the idea that server and chip vendors can now provide reference architectures (e.g. Intel) and create a staging area for new innovation. NFV has the potential to create a new paradigm in the networking industry by enabling enterprises and SPs to speed up application and service delivery to their end customers. This is indeed a trend that is worth following.
When Rackspace launches a new cloud service globally, take note. As the open cloud leader, Rackspace knows what customers need. Now Rackspace once again advances the state of cloud computing with networking services on demand, powered by the Brocade Vyatta vRouter.
Getting your business onto the most open and flexible cloud platform has tremendous advantages. However, doing so without compromising the controls and security typically found in enterprise-class networking has traditionally been a challenge. Critical network services such as firewall, NAT, VPN and routing cannot be overlooked.
Now Rackspace has eliminated that challenge by enabling customers to take control of how their applications are segmented and secured in the cloud with the Brocade Vyatta vRouter, a software-based networking appliance that lets you leverage “open” to its fullest potential and can be provisioned on demand. Powered by the industry’s most widely-deployed virtual router, Rackspace now offers on-demand networking and enhanced security for hybrid cloud environments.
With the Brocade Vyatta vRouter, companies can significantly reduce the risk of exposing cloud infrastructures to the outside world, whether to competitors, other customers or even different units within the company. That’s why it is absolutely critical that customers be able to connect and protect their infrastructure in a way that simultaneously gives them the controls they need, with the flexibility that the cloud affords.
More than 1,500 customers worldwide have deployed the Brocade Vyatta vRouter for on- and off-premise networking and security needs. Founded in 2006 and acquired by Brocade in 2012, Vyatta is the pioneer and market leader in enterprise-class virtualized networking. Like Rackspace, Brocade believes customers should have choice and not be artificially trapped by limited system designs.
Take to the Open Cloud on your terms. Now with on-demand networking services, Rackspace is making sure you can do it without compromising ease, economics – or control.
Brocade is at Interop, Las Vegas this year, sharing the latest on the On-Demand Data Center. This is my first Interop, so I decided that I’d keep a daily diary of the experiences of a booth exhibitor. I’ll be updating this blog throughout the event, so stay tuned for the latest. In the meantime, if you’re at Interop, come visit me in the Brocade booth, #815! If you’re real nice to me, maybe I’ll even mention you in the blog.
Monday, 5/6: I’m here at Mandalay Bay and everything looks great. I’m getting excited to begin tomorrow! First thing I did was take the long walk over to the exhibition hall and register in the wrong section. Off to a good start! The booth is excellent. I did a walk around, and the theater and the Ethernet Fabrics and Vyatta vRouter demo spaces are set up and looking fancy. Now the space I’m most interested in (which has nothing to do with the fact that it’s my space), our More Information pod, is coming together quite well. The More Info pod sign, highlighting Brocade’s Network Functions Virtualization portfolio, OpenFlow solutions, and OpenStack plugins, turned out really well. The expo starts tomorrow, so today is all about preparing and getting acquainted with the space and schedule. As a hardworking booth exhibitor, I surely won’t have time to check out the pool or any such amenities.
Tuesday, 5/7: The expo hall opened today, kicking off the festivities. We had a sizeable crowd come through, everyone all bushy-tailed and bright eyed at the start of things. I had some great conversations with people about Brocade’s SDN vision and how we’re differentiating with Hybrid Port Mode and our hardware-enabled OpenFlow. A good number of folks are interested in our Vyatta vRouter as well. Overall throughout Interop, there is a substantial interest in SDN and Network Functions Virtualization. I’m looking forward to seeing the future developments in those spaces.
Wednesday, 5/8: The full expo day kicked off today, and it was a packed slate! I barely had time to grab lunch, but these are the sacrifices of a noble booth exhibitor I suppose. I had a lot of excellent conversations with people today, from discussing some use cases of SDN outside of the data center (like in the WAN, for example) to talking about the role of OpenStack and orchestration in networks. Our video guys stopped by the booth to check in on everything toward the end of the day and put me on camera for a short spell. Coming over at the end of the day was a bit unfair for my personal vanity, but hopefully the good stuff came across. We had quite the crowd gather when it was time to announce the five winners of our daily giveaway (five people who register the most steps on the pedometers we’re handing out win $100 each). There’s still one more round on Thursday, so if you're attending Interop and haven’t picked one up yet, stop by our booth and we’ll give you a pedometer. Hope to see you there!
Thursday, 5/9: The final day of the Interop expo came quickly, and it was a good one. I had a few great conversations about Brocade’s solutions in the data center core and interconnecting distributed enterprise data centers with the MLXe. Definitely some positive responses to Brocade’s VPLS load balancing capabilities. The biggest hit of the Brocade booth was likely the pedometer giveaways however. We had a great crowd gather to see our theater presentation and pedometer winners. I had to leave a little early to catch my flight (which was of course summarily delayed), but overall Interop was a lot of fun. I met some quality people and enjoyed talking about networking. Until next year!
On May 17 Brocade will participate in SDNCentral’s DemoFriday™. In this one-hour presentation viewers have the opportunity to see a live demo of how Brocade is taking an innovative approach to OpenFlow by implementing OpenFlow in Hybrid Port Mode on its high-performance routing platforms. This unique capability provides a pragmatic path to SDN by enabling network operators to integrate OpenFlow into existing networks, giving them the programmatic control offered by SDN for specific flows while the remaining traffic is routed as before. To register for this virtual event, please click here.
In anticipation of DemoFriday, we sat down with Matt Palmer of SDNCentral and asked him about the SDN and OpenFlow market, technologies, and ecosystems. Matt is the partner and co-curator of SDNCentral with over 20 years of experience in software-defined networking (SDN), cloud computing, SaaS, & computer networking.
Brocade: Today, OpenFlow is widely used in production environments mainly by the Research and Education community and some large cloud or Web 2.0 companies. What do you think needs to happen for OpenFlow to become more mainstream?
Matt Palmer: Today, SDN (including OpenFlow) is an emerging technology, which means the early adopters are organizations that are either a) looking to gain competitive advantage, or b) developing new capabilities - meaning that SDN generically, and OpenFlow specifically, are right where they should be on the customer adoption curve. For these early customers, each deployment is a custom project with custom software development. What the market needs in order to move to more tailored projects is for these early use cases to find applicability at more organizations. We are seeing this today; for example, we are seeing commonality emerge across customers for use cases such as network visibility and network service chaining, and how OpenFlow may solve these problems for specific classes of customers. As these use cases become more common, we’ll see packaged solutions emerge that mainstream customers can buy instead of custom-built solutions.
Brocade: In your SDN Market Forecast research you purposely did not size the market through Ethernet-port-based modeling by using OpenFlow as a proxy for SDN. This makes sense because there are a lot of other SDN technologies outside of OpenFlow. However, do you have any estimates on the market size for OpenFlow specifically?
Matt: We sized the market based on business impact, which we measured by looking at shifting customer-buying preferences. The benefit of looking at the market this way is that we can see how SDN as an emerging technology trend is impacting customer-buying decisions today and what it means in the future. Based on our end customer interactions, we see SDN having a significant impact on today’s purchase decisions. Specifically, we see a growing number of customers demanding that network infrastructure have committed support for SDN capabilities like programmatic APIs and hooks into orchestration systems like OpenStack. Like most protocols, we don’t see a reliable means to measure market size for OpenFlow, as in our experience customers are focused on capabilities that solve their problem, not on specific protocols. For example, you don’t hear people sizing the market for OSPF vs. BGP; instead they measure the router market.
Brocade: Brocade is implementing OpenFlow in Hybrid Mode on its high-end router family, in both Hybrid Switch and Hybrid Port modes. In working with different network operators considering OpenFlow, what do you hear as the general perception of Hybrid mode implementations in the market today? Is this the pragmatic approach?
Matt: What’s cool about Brocade’s Hybrid Mode is that you selectively choose which traffic to run in traditional switching and routing mode and which to apply OpenFlow primitives to, so that you can run OpenFlow over an existing production network. We see this as especially important for the R&E community (like Internet2), and we also see this capability being potentially relevant to WAN operators such as traditional wired service providers as well as mobile network operators.
An interesting and not well known fact is that, through our sister company Wiretap, our software architects prototyped in 2012 a service chaining and tap aggregation application for one of our clients on a number of switching platforms, including the MLX. Back to your previous question: hybrid mode was one of the key features needed before they could move forward, so as OpenFlow matures on platforms such as the MLX we expect to see customers looking to use it more for mainstream applications such as service chaining and tap aggregation, which have broad appeal.
Brocade: What is your overall take on OpenDaylight? What do you think needs to happen for it to remain true to its mission to be open source? How do you see it fitting with the Open Networking Foundation, and ultimately its effect on OpenFlow?
Matt: OpenDaylight has the potential to create a viable open-source SDN software ecosystem, which the industry was missing until OpenDaylight. That’s a net positive. The real measure will be whether OpenDaylight can deliver on that promise; until there is a) code ready for customer consumption and b) vendor(s) who can support that code, all of this is still theory. There’s significant promise, but that’s all it is until the OpenDaylight team delivers the first iteration. The organizations we see most impacted by OpenDaylight are the early SDN software start-ups, who’ve just had their entire business models turned upside down by a large industry consortium; it will be interesting to see how they evolve.
Brocade: OpenFlow is one of the leading and most widely known SDN technologies, but where do you see OpenFlow fitting into the SDN framework in the future? 5 years out? 10 years out?
Matt: Let’s be pragmatic: the SDN market and OpenFlow as a technology are still in their infancy, and customers are still aligning on use cases. What we see for the next 12 to 36 months is OpenFlow continuing to evolve for specific use cases such as WAN traffic engineering, TAP aggregation, and service chaining, with potential to also be adapted for things like optical transport. Though for at least the next 12 to 18 months, most of those will still be custom deployments. More broadly, the mega trends of cloud computing and virtualization, mobility with tablets and smartphones, and social computing are taking us from millions to billions of servers, mobile devices, and interconnections between those devices and servers. That 3X to 5X order-of-magnitude change in the number of devices and connections on the network is forcing network operators to change how they build and operate their networks, and it is driving SDN.
Brocade: Besides OpenFlow in hybrid implementations, what are some of the other ways you see SDN making its way into existing production networks within this year?
Matt: One obvious place is OpenStack and integration via the Quantum plug-ins. We see a lot of interest around integrating Quantum with the physical network. There is unlikely to be significant production deployment, though we are seeing enough interest that we are writing a report on the various vendor add-ons to the Quantum plug-in and deploying many of them in our lab for testing.
Brocade: As SDN becomes more mature, can you summarize your one piece of advice for customers when considering an SDN technology? And for vendors developing SDN solutions?
Matt: SDN Consumers: Validate, validate, validate. Everyone is making claims, and before you select a vendor you need to understand what is reality and what is not as that relates to your timeframes. Go find a trusted advisor who’s ‘in the know’ and can help you get started. That said, 2013 is the year of the pilot and POC, so identify a high value area where SDN may solve a business problem or create a new opportunity and start learning. We are seeing a 10X increase in the number of prospective customers coming to SDNCentral to explore SDN, which means your competitors are likely already off and learning what SDN could do for them. If you don’t know who can help, ask us at SDNCentral and we can refer you to an SDN expert in your part of the world. Or email me at matt at sdncentral dot com.
SDN Producers: At the risk of sounding like you are SDN-washing, articulate the programmability capabilities you have, get in the game, and fight to get into early field trials. Everyone is making this up as they go along, so while you can’t get too far ahead of reality, you also need to paint for customers a future vision that articulates how you can help them achieve their objectives with your capabilities today and tomorrow.
Brocade: What inspired SDNCentral to offer DemoFridays?
Matt: We saw that, to accelerate mainstream adoption, the market needed a place for enterprise, data center, service provider, and network architects and operators to see SDN technologies in action so they can start to learn how to apply SDN principles to their situation. We are excited about the Brocade DemoFriday™ because we are showing capabilities for Service Providers with the MLX and for the Data Center with VCS and OpenStack.
I recently worked on a project that brought up some interesting questions. The customer was building a Hadoop cluster and wanted to test performance differences between traditional Ethernet protocols and OpenFlow. While this sounds pretty normal, the conditions surrounding the project made it a bit more complex. I worry that this post may expose how much I have yet to learn about OpenFlow, as well as Hadoop, but I’ve never been shy about putting myself out there. How can you learn anything if you aren’t brave enough to admit you don’t know everything? So, in the comments section, I encourage you to write what kinds of similar experiences you’ve had and whether there were any resolutions or solutions that you came to.
The set-up starts off pretty straightforward: 40-node cluster, standard replication in triplicate, and top of rack switches. The details of the servers, drives, processors, or even performance ratios were not provided. I was basically given a picture of what looked like a cabinet of servers with lines coming out both sides of them to two different top of rack switches. Those switches were uplinked to a legacy core switch on one side, and the other side ended in an arrow pointing somewhere off the design. The question was, “Can we do this?” First, I have to explain that as a subject matter expert in Big Data, you get blind-sided by the strangest ideas at least a few times a week. The only benefit my background gives me is a place to start researching. Sadly, many think that the title gives me super-human ability to fire authoritatively from the cuff on ideas that have genuinely never been tried before, with no empirical evidence, with only a minute or two of lead time. I wish it did. I’m not complaining; after being an SE for 8 years, I enjoy the pace, diversity, and pressure. So, back to our story…
The two sides that were later explained to me were separate networks overlaid on top of each other. The left side of the server stack was NIC #1 (every server was dual-NIC’d, 1GbE, no IP binding) and it was connected to a normal ToR 1GbE switch that had a gateway in a “Stable” network where they would ingest the dataset, allow customer access for queries, and output the results. That was a production environment where downtime was frowned upon (OK, the customer is a bank). The right line coming out of each server went up to an “SDN Switch” and on out through an “SDN gateway”; whether a data-set came in from that side, whether there was user access from that side, or whether outputs were to go out that network was unmentioned. However, this was the experimental side where they would be “courting risk”. Yes, the same servers are expected to run production for a bank and belong to an experimental SDN network separated only by the PCI express bus on the board. This is where solution designers either break down and start crying or get really creative. I must also point out that the local SE had read a LOT about Hadoop and had prepared a few good looking designs, but since this was the first dual-interconnect Hadoop cluster that runs half on SDN, I didn't have anything to compare them to.
The idea of this cluster is to see if the OpenFlow mechanisms could out-perform traditional L2 networking in a loosely coupled cluster. This, in my own opinion, is a very interesting study. Think of the ramifications if this were the case. Companies could dual-NIC or use some kind of dual-mode configuration on all of their desktop machines and when everyone went home at night, SDN could build a cluster with all their idle resources and crunch all kinds of jobs in the evenings. It’s like a dual-purpose IT infrastructure. You could also dynamically link machines together in the wireless or cellular carrier space for crowd sourced cluster computing. Sure, these ideas exist today, like the SETI project, but not on this level, where the device is actually a cluster node for a larger running job. To a solutioneer and closet futurist, an epiphany like this goes instantly to the largest, most grandiose use cases imaginable. I’ll spare you the pie-in-the-sky ideas and get back to how this project ended up and what’s happening going forward.
So, I presented the file system problem: Hadoop cuts up files and stores them on member nodes with an IP address in the header, so it’s network specific. To my knowledge, and I get most of that from the O’Reilly book on Hadoop, for best reliability Hadoop should run in one broadcast domain. Before you jump all over me and point out all the different configurations that you can adjust to have multi-domain clusters, I’ll just say that I know they exist and that “Best Practice” was to have the namenode and the datanodes in the same broadcast domain. This led to the question, “Well, what would you be testing then?” If the default vLAN had all the addressing for the cluster and you used a protected vLAN configuration to put OpenFlow services in a separate vLAN (even if the interfaces were tagged), wouldn’t all the node-to-node traffic always take the default vLAN, since that is where the file system exists? This would, in effect, drain any traffic from the two vLANs that you are trying to compare, and you wouldn’t have any real data for your observations of which network is faster.
This led to a few more design ideas that included adding another distribution layer, IP bonding to the same ToR switch, and a few different multi-mode options before the customer decided to simply build two Hadoop clusters.
While the conclusion sounds anti-climactic, there is a reality here that has to do with pushing the boundaries of technology. Some of the lessons that ring in my ears are: If you're going to have an experimental network, don't couple it to something that your revenues depend on. Second, and probably more important, we're all still creating. We all have to understand that Mr. Cutting wrote Hadoop in 2005 and didn't have a production installation until 2008. It's only 2013, and while we believe we can clobber mountains in a single keystroke, we have to keep perspective. All of the big names that define the current software revolution are very new, and there's a danger in trying to implement technology for the sake of efficiency without any concrete, historical evidence. When you do this, you are an innovator and on the bleeding edge. If that isn't where you wanted to be, maybe you should take a long look at what your strategy actually is. There's nothing wrong with blazing a new path toward greater productivity, but you have to know that it's fundamentally different from implementing a tried-and-true technology into an innovative business process. The innovator brings different kinds of risk to the equation. If your goal is higher productivity through better use of technology and resources, you must allow that technology to find best practices in the world before you make plans that might not yet be possible.
I’m tracking this one closely as I’m keenly interested in the potential benefit of SDN controlled inter-connects for Hadoop. Maybe someone will come up with an application that injects paths based on traffic flow, size, current resource utilization, power consumption, and a host of other variables that propels cluster computing to yet another level of performance. What I do know is that it should be tested separately!
I look forward to your stories of similar environments where you may have been able to pull it off. Even if you weren't, I think we could all benefit from hearing about what you went through. In the future, we will be hosting a Big Data social media page with its own blog as well as public and private groups for sharing lessons learned and best practices throughout the industry.
I am very excited today about Brocade's announcement of its continued commitment to OpenStack and of its delivery of solutions to our customers that need Open, Agile and Scalable Cloud networking architectures. And to help us get started off with a bang, we announced that seamless OpenStack integration is now available for Brocade VCS Fabrics: bit.ly/12XbJBf
Having worked with other SDN solutions as well, I have come to realize the importance of building a strong partner ecosystem upfront. Thankfully, Brocade has done a great job of partnering with best-in-class OpenStack vendors such as RackSpace, Red Hat and Piston Cloud. In fact, we have none other than Jim Curry, the GM of RackSpace Private Cloud (also the co-founder of OpenStack), discussing the value proposition.
Earlier this month at the OpenStack summit, I was on this panel moderated by Lew Tucker with fellow networking vendors, and we discussed business models around Quantum (now called Networking Service). One thing that was clear was the usefulness of Quantum having both a pluggable and an extensible architecture: vendors can not only write extensions onto the framework, where innovation is happening at the core, but can also write plugins for their platform-specific innovations, enhancing their platform IP while contributing to the overall acceleration of innovation. Brocade does both to provide maximum value to our end customers. Our own Didier Stolpe does a great job of explaining this in this video with me.
But wait, there is more! This excitement is sure to continue: in addition to these solutions, we are also working on OpenStack solutions for FC SAN, Virtual ADX, ADX and the Brocade Vyatta vRouter.
It must be Spring; news is blooming everywhere.
This week there is a lot of coverage about Brocade’s On-Demand Data Center strategy. For that matter, the last three weeks have also held a lot of news on the topic of data centers and networking, including news from the 3,000-attendee OpenStack Summit and 2,000-attendee Open Networking Summit. Clearly there’s change in the air.
With change comes opportunity, and also challenge. The industry is not short on the hype that attends change, but the hangover from the hype is also true as customers begin to realize what’s real today and what’s not - a.k.a. Gartner’s “trough of disillusionment.” This is where Brocade is unique with a strategy that unites the physical and virtual infrastructure environment, and balances forward-leaning technologies with pragmatic approaches to usability and adoption.
A very large part of Brocade’s strategy is seen in the increasing investment we’re making in software across the board. There is a powerful industry dynamic I’ve written about here, and Brocade is very bold in this pursuit of software as a natural complement to physical networking hardware.
The advantages and role of software in the emerging data center address three major customer needs:
At the end of the day, the tone coming from Brocade is increasingly "open" and "software," which are powerful complements to our industry-leading hardware products. This is a strong move forward for customers, enabling them with advanced, modern technologies that are coupled with a pragmatic and evolutionary approach to SDN.
The data center network is increasingly On-Demand. Keep watching this space; the news has only begun…
The Multi-Service IronWare software release 5.5 for Brocade MLXe supports OpenFlow version 1.0 with a new “hybrid port mode” option. This is the first product in the industry to support OpenFlow hybrid port mode. OpenFlow hybrid port mode is supported as part of a normal software release. That is, OpenFlow hybrid port mode is available as a fully supported feature on the MLXe. This is not an experimental feature or a prototype for experimentation.
Brocade has supported OpenFlow “hybrid switch mode” since OpenFlow was first introduced as part of software release 5.4 in 2012. Software release 5.5 adds a new hybrid option Brocade calls “hybrid port mode.” Before we go into the details of hybrid port mode, let’s review what hybrid switch mode provides.
Brocade's OpenFlow hybrid switch mode allows the user to enable OpenFlow on any desired port on the MLXe or CES/CER router platforms, while other ports on the same router can run any other supported features, such as IPv4/v6 routing, MPLS VPNs, etc. This means that the router can be divided into two sets of ports: ports enabled with OpenFlow and ports running other router features. An OpenFlow Controller connected to the router will only see the OpenFlow-enabled ports. For that reason, the OpenFlow Controller can only forward packets among the OpenFlow-enabled ports. Ports enabled in the "hybrid switch mode" cannot be configured with other router forwarding features such as L2 switching, VLANs, IPv4/v6 routing, MPLS VPNs, etc. Thus, normal ports (i.e., those not enabled with OpenFlow) cannot forward packets to OpenFlow-enabled ports. In effect, the router is split into two routers.
Like hybrid switch mode, Brocade’s OpenFlow hybrid port mode allows users to enable OpenFlow on any desired port on the MLXe. However, the port can support other router features concurrently with OpenFlow. For example, the user can configure IPv4 routing (BGP, OSPF, or ISIS) on a set of VLANs on a port and enable OpenFlow on the same port. When a packet arrives at the port, the packet is first submitted to the flow table. If there is a match, the actions specified in the flow are executed. If the packet does not match any flow, the packet is submitted for normal forwarding. In this example, if the packet belongs to one of the configured VLANs, the packet would be routed. Otherwise, the packet would be subjected to the default OpenFlow action, i.e., drop or send to the controller, per configuration.
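To make that lookup order concrete, here is a minimal Python sketch of the hybrid port mode decision logic as I understand it; the packet and flow-table structures are invented for illustration and are not the MLXe implementation.

```python
# Illustrative sketch only: mirrors the hybrid port mode lookup order,
# not the actual hardware pipeline.

class Packet:
    def __init__(self, vlan_id, dst_ip):
        self.vlan_id = vlan_id
        self.dst_ip = dst_ip

def handle_packet(packet, flow_table, routed_vlans, default_action="drop"):
    """flow_table: list of (match_fn, action_fn) pairs, checked first."""
    # 1. Every packet is first submitted to the OpenFlow flow table.
    for match, action in flow_table:
        if match(packet):
            return action(packet)
    # 2. On a table miss, the packet falls through to normal forwarding
    #    if it belongs to a VLAN configured for traditional routing.
    if packet.vlan_id in routed_vlans:
        return f"routed via IPv4 lookup toward {packet.dst_ip}"
    # 3. Otherwise the configured table-miss behavior applies
    #    (drop, or send to the controller).
    return default_action

# Example: VLAN 100 is routed normally; a flow steers VLAN 200 out port 7.
flows = [(lambda p: p.vlan_id == 200, lambda p: "output port 7")]
print(handle_packet(Packet(200, "10.0.0.1"), flows, routed_vlans={100}))  # output port 7
print(handle_packet(Packet(100, "10.0.0.1"), flows, routed_vlans={100}))  # routed ...
print(handle_packet(Packet(300, "10.0.0.1"), flows, routed_vlans={100}))  # drop
```

The key point is simply the ordering: OpenFlow first, then traditional forwarding, then the table-miss action.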
Did we say the user needs to configure VLANs to enable OpenFlow on the port? Absolutely not! With the Brocade implementation, OpenFlow is always enabled on the port, not on a VLAN or a set of VLANs. In fact, the user can enable OpenFlow hybrid port mode on a port before any other feature is configured on the port. In the previous example, the user decided to configure a set of VLANs for normal IPv4 routing.
Can OpenFlow hybrid port mode match on any VLAN id on that port? Absolutely! OpenFlow hybrid port mode is enabled on the port and can match on any VLAN ID on that port without regard to the existence of any VLAN configuration on that port. The reality is that VLAN configurations belong to the normal router features on that port and not to the OpenFlow configuration.
Now that you may be getting excited with thoughts of cool and novel applications you can build with the Brocade hybrid port mode feature, here is one more twist. You can enable some ports on the MLXe in hybrid switch mode and other ports in hybrid port mode, while leaving other ports without OpenFlow configuration. This splits the router into three sets of ports (see figure below). All of this runs at line rate for 1G, 10G, 40G, and 100G ports. I guess it is redundant to say that packet forwarding is hardware based. Be careful and do your homework! Some other vendors support OpenFlow data forwarding in software, which is never line rate.
Some folks may be thinking… “if hybrid port mode supports OpenFlow lookups followed by traditional lookups, this must increase latency.” However, that is not the case. The Brocade hybrid port mode feature does not increase latency.
Brocade demonstrated the MLXe hybrid port mode capability at the Open Networking Summit from April 15th to 17th at the Santa Clara Convention Center (http://opennetsummit.org/). The Brocade hybrid port mode is already deployed by a customer on a nation-wide 100G production network supporting traditional IP routing underlay with OpenFlow overlay.
Brocade at the Open Networking Summit
Why hybrid port mode? Brocade customers requested support for this feature. Customers want to be able to create an OpenFlow overlay on top of existing production networks. The OpenFlow overlay would be used to support new premium services and SDN applications on top of the underlay network. As mentioned above, the Brocade hybrid port capability is already deployed in this way on a nation-wide 100G production network.
What does this mean for you? If you’re interested in taking a practical path to SDN, hybrid port mode is exactly what you’re looking for. With the Brocade hybrid port mode you do not need to create a separate network to realize the benefits of SDN and OpenFlow. You can deploy an overlay SDN/OpenFlow network leveraging your existing network. Brocade hybrid port mode is available as a software upgrade for the Brocade MLXe.
You may be thinking… “That seems risky, since I will be testing OpenFlow controllers and SDN applications on top of my underlay production network. What if there is a misconfiguration on the OpenFlow overlay and it drops my production traffic on the underlay network?” This is a valid concern. For example, if the OpenFlow controller pushes a flow to the router matching on any packet and the action is to drop the packet, the router would drop all packets, including underlay traffic. That can happen if you are not careful. Fortunately, Brocade has a solution for this problem. While you are testing OpenFlow controllers and applications, you can “protect” the underlay traffic. The Brocade hybrid port mode feature supports “VLAN protection”. With a simple configuration command you can protect a set of VLANs from being affected by OpenFlow. Packets arriving on a protected VLAN will skip the OpenFlow table lookup. VLAN protection is supported in hardware. That is, performance is line rate as usual.
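Building on the earlier sketch, VLAN protection can be modeled as one extra check in front of the flow table; again, this is only an illustration of the behavior, not the actual implementation.

```python
def handle_packet_with_protection(packet, flow_table, routed_vlans,
                                  protected_vlans, default_action="drop"):
    # Packets on a protected VLAN skip the OpenFlow table entirely, so a
    # misbehaving controller or flow cannot disturb the underlay traffic.
    if packet.vlan_id in protected_vlans:
        return f"routed via IPv4 lookup toward {packet.dst_ip}"
    # Everything else follows the normal hybrid port mode order:
    # flow table first, then traditional forwarding, then the miss action.
    return handle_packet(packet, flow_table, routed_vlans, default_action)
```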
Do I need to protect any VLANs? No. This is an optional feature. Whether you want to run the underlay traffic on protected VLANs or not is your choice. Why would I choose to allow OpenFlow to touch the underlay traffic? To support premium services using OpenFlow. For example, premium services such as traffic engineering and service insertion/chaining can be added to selected underlay traffic by allowing OpenFlow to touch the underlay traffic. Besides supporting testing and experimentation, is there a use case where VLAN protection is desirable? Yes. You may want to protect management VLANs.
Brocade hybrid port mode with optional VLAN protection provides a practical path to SDN.
Stay tuned for further innovations from Brocade!
I am drawing parallels between conversations I had a few years ago on BYOD and one I had last week on private cloud using OpenStack. Not on the technologies themselves, but on how new shifts bring new problems to be resolved and the innovation surrounding them.
In the late-2000s, I had the following BYOD conversation with a friend who worked in IT:
Me: Users bringing their personal devices to work should be an interesting trend.
Friend: We will NEVER allow users to bring in their own devices.
Me: But the mobile devices are..
Friend: What about SECURITY?!
As we all know, these were issues that were resolved (in most cases) and BYOD is here to stay.
And interestingly, just last week, I had the following conversation with another friend…
Me: A self-service portal for IT users – private cloud - should be an interesting trend in large enterprises.
Friend: But I don’t think IT admins would want to lose control.
Me: They can setup templates and they still have control. In fact they lose all control when some of these users go rogue.
Friend: But what about usage monitoring? What about internal BILLING?!
Agreed, there are still some issues that need to be fully resolved here, but trends like these are bound to force innovation. In fact, there is an OpenStack project on billing called Ceilometer that I am eager to get updates on at the OpenStack Summit next week.
OpenStack Summit is a great place for the community to showcase and learn about the various innovations happening in our industry. Speaking of which, I am also very excited about Brocade's participation at the OpenStack summit. We have the following speaking sessions:
Yours truly will be on the panel - Networking, the Final Frontier of the Software-Defined Datacenter (Monday 4:45 to 5:30)
Orchestration of Fibre Channel Technologies for Private Cloud Deployments (Monday, April 15, 2013; 9:50 a.m)
Panel Discussion: Enterprise Vendors in the OpenStack Ecosystem (Monday, April 15, 2013; 1:50 pm)
We also have booth C20, where we are showcasing OpenStack demos with our VCS and ADX products and where we can discuss the innovations in IP networking and FC SAN.
So, come see us at the OpenStack Summit and see how Brocade solves our customers' problems. And yes, let's also jointly take in all the other cool innovation our community is bringing to the table at the summit.
In just five days, the third Open Networking Summit kicks off in Santa Clara, California. Brocade will be there, will you? Whether you are attending or not (maybe you’re just waiting for the presentations to be made available online…it’s a lot cheaper…), I want to give a preview of what types of things you can expect to hear and see from Brocade in the halls of the great Santa Clara Convention Center.
In preparation for this year's event, it was clear that the focus of this ONS will be on SDN that is actually seeing the light of day in real-world production environments, and this is a key message Brocade will have on display. For example, in our booth (Booth #102), you will see two demos: one showing how Brocade is implementing OpenFlow on its high-performance routing and switching product lines in Hybrid Port Mode. This is an innovation that only Brocade can offer to customers so they can easily deploy OpenFlow into existing networks as an overlay by running OpenFlow traffic concurrently with traditional routed traffic on the same port. In another demo, we will show a proof of concept of OpenFlow simplifying operations for a high-value software application for real-time network analytics. In addition to the OpenFlow demos, come by and learn about some of Brocade's other SDN innovations, such as Vyatta and how its software networking solutions can empower an SDN. Although the OpenStack Summit is happening the same week up in Portland, you can swing by the Brocade booth at ONS to hear how Brocade is taking OpenStack to the next level across its product portfolio for agile cloud orchestration. Brocade was a hardware company from the start, so of course we will have some gear at the booth that is central to our SDN strategy: the MLX Series Core Router, VDX Series Ethernet Fabric Switch, and ADX Series Application Delivery Switch. You can also stop by the NEC booth to see the Brocade MLXe in action in an interoperability demonstration showcasing our commitment to open SDN solutions.
On Tuesday during lunch, when the exhibits will see the most foot traffic, Brocade will present its unique and differentiated approach to OpenFlow, with key benefits, applications, and use cases for Hybrid Port Mode that customers are looking at and implementing today. Come by the exhibition hall theater at 1pm on April 16 for this fascinating presentation. On Wednesday, the final day of the conference, Brocade's Service Provider business CTO and chief scientist, David Meyer, will join the closing panel with representatives from Verizon, VMware, and LightSpeed Ventures, chaired by overall event chair Guru Parulkar.
All in all, ONS promises to be a very interesting and educational event. I am especially looking forward to hearing what the SDN users have to say, such as entities from the service provider, cloud, Web 2.0, and research and education spaces. What about you?
Check out the Brocade Communities Event Page for more information on our involvement at ONS.
If you’re reading this, you've likely already seen the breaking news about the multi-vendor, open-source SDN initiative dubbed “OpenDaylight Project.” As a founding member, Brocade committed early to this project and it’s great to see the news out in the open now.
When briefing industry analysts last week, one analyst posed this question: Why was OpenDaylight initiated by a multi-vendor effort instead of being driven by customers with a collective interest, as was the way with Open Networking Foundation? It was a perfect question to expose the very essence of what is needed for Software-Defined Networking to progress.
The concept of SDN does not discriminate between the types of network infrastructure being programmed and managed. SDN elevates the discussion beyond a product type – say, switches – and extends to routers, storage networking, security, load balancing and application delivery. SDN also elevates beyond hardware networking, to include software networking as well.
In order to enable entire networks to be programmable, it became apparent that the first move had to come from the vendors themselves. They had to first agree to a collective, open concept in order for the industry to move forward and innovation to thrive. As such, OpenDaylight is the most ambitious example of openness the networking industry has seen in ages.
To be clear, OpenDaylight is not about creating standards; it's about creating code and reference designs that can be leveraged in a multi-vendor infrastructure environment. The OpenDaylight code base will change rapidly: counting just the developers already committed by vendors (and not adding in the come-one-come-all developers a project like this attracts), the OpenDaylight code base will have been advanced by 100 man-years just in the next 365 days. That's phenomenal momentum for customers to leverage, and it will all be available in the battle-tested open-source licensing model of Eclipse.
This is truly a new era in networking. Vendors are agreeing to an abstracted controller as a level playing field and agreeing to advance its capabilities. This shifts the vendors’ focus of strategic advantage away from traditional speeds and feeds and toward the ability to support rich, value-adding network applications.
As with all new eras, this one has to start somewhere, and OpenDaylight does that with aplomb. As customers seek to differentiate their businesses through network capabilities, this is going to be one fascinating ride.
I recently returned from IETF 86 and would like to provide a short update on the SDN related activities. While the ONF continues to drive standards around OpenFlow and continues to promote SDN use cases and solutions, the IETF is now becoming more involved in SDN related technologies, protocols and standards.
Here is a link to a previous blog where I talked about some standards activity related to SDN. That blog will also provide some background information on various IETF WGs.
MPLS Working Groups
The various MPLS WG activities focus on many things specific to service provider networks; interestingly enough, a more evident relationship is emerging between the various MPLS WGs and the SDN solution space.
For example, in the L2VPN WG there was a discussion of a VXLAN over L2VPN Internet Draft (ID). This would provide layer-2 MPLS connectivity between data center VXLAN or NVGRE logical overlay networks. This is a pretty cool use case and it appears to be a needed solution if VXLAN/NVGRE solutions become more widely deployed in data centers. A somewhat related topic was discussed on how Ethernet VPNs (E-VPNs) could be leveraged to provide a data center overlay solution. In this context, E-VPNs are based on MPLS technologies. While this solution revolves around Network Virtualization Overlays, it was discussed in the L2VPN WG due to it leveraging MPLS technologies. This same Internet Draft was also discussed in the NVO3 WG.
In the L3VPN WG, there were also quite a few IDs that overlap with the NVO3 WG and data center overlay technologies. The general support for MPLS-based solutions for data center overlay architectures appears to be gathering momentum. This is only my personal observation after attending these meetings. But it is interesting to notice that the various MPLS WGs are becoming more involved in data center overlay solutions; often including SDN-like solutions. This makes one wonder where this might be heading ...
Specific to the L3VPN WG, drafts that could be considered related to SDN are VXLAN/NVGRE encapsulation for L3VPNs and the activity around the virtual PE and CE.
While activities in other WGs, such as PCE and ALTO, could also be considered related to the SDN solution space as well, I will discuss those in my SP community blog so please go there for additional details.
The Newer IETF Working Groups
Now on to the more interesting (and controversial?) WG activities! Of all the WG meetings at this IETF, NVO3 was the most heavily attended. It was practically standing-room only.
As you may recall, NVO3 is focused on the data center overlay problem space and architecture. This is not to be confused with DC “underlay” architectures, such as TRILL. Ethernet fabrics, whether based on TRILL or some other protocols, are considered an underlay technology; while an overlay technology is a logical network construct that leverages the many benefits of Ethernet fabrics.
In earlier NVO3 meetings, overlay technologies such as VXLAN and NVGRE were often discussed, while more recently a lot of the discussions include MPLS-based overlays. This particular meeting was heavily focused on the charter and framework of the NVO3 WG, rather than what an architecture or solution might look like. I believe what happened here is that there were too many solutions being offered as part of this WG, while a clear definition of the charter and the requirements of the problem space hadn't been fully specified and agreed upon. So, this WG has some re-chartering to accomplish, with the intent of having an architectural framework and clear requirements defined by the next IETF in Berlin. Lots of work to do here!
Also discussed in the NVO3 and L3VPN WGs was the need for “inter-subnet routing”; in other words, layer-3 routing between IP subnets and/or between logical network overlays. I kept thinking of Vyatta during these discussions.
Another WG that should also be followed by the SDN community is I2RS. Like NVO3, this WG is fairly new. The primary goal of this WG is to provide a real-time interface into the IP routing system. While some could say this activity is not SDN related, I think it’s close enough to warrant a mention here. There is also more on this topic in my SP community blog.
I'll close this blog out with an update on the SDNRG. Brocade's SP CTO, Dave Meyer, kicked off the meeting with a high-level architectural discussion of what SDN is really all about. This thought-provoking talk was aimed at helping to "bound" the SDN problem space; SDN means many different things to many different people, so the talk was intended to get the audience on the same page.
A presentation was made on a Software Defined Internet Exchange (SDX) proof of concept. It uses a Brocade switch as the cross-connect fabric! The idea behind an SDX is to use a controller to peer (i.e., BGP) in the control plane, and use OpenFlow to instantiate the connectivity (i.e., flows) in the SDX data plane. What a cool use case! This reminds me of the early days of MPLS (circa 2000-ish), when an MPLS-based Internet Exchange switch was being talked about and a similar proof of concept was tested. Although that concept did not take hold (all Internet Exchange Points, or IXPs, are Ethernet based), it helped promote and eventually validate MPLS as a deployable networking technology. Could this SDX proof of concept hold the same promise for OpenFlow? Here is the diagram from the presentation.
[Diagram from IETF 86, SDNRG WG, SDX: A Software Defined Internet Exchange, by Nick Feamster]
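To illustrate the SDX idea at a very high level, here is a hedged Python sketch of the control-to-data-plane translation: BGP announcements learned by a route server become destination-based flow entries on the exchange fabric. The announcement format and flow representation are invented for illustration; the real SDX work supports much richer, per-participant policies.

```python
import ipaddress

# Hypothetical BGP announcements heard by the SDX route server:
# (participant AS, advertised prefix, fabric port where that AS connects).
announcements = [
    (65001, "203.0.113.0/24", 1),
    (65002, "198.51.100.0/24", 2),
]

def compile_flows(announcements):
    """Turn control-plane announcements into data-plane flow entries."""
    flows = []
    for asn, prefix, port in announcements:
        flows.append({
            "match": {"ipv4_dst": ipaddress.ip_network(prefix)},
            "action": {"output": port},   # cross-connect on the fabric
            "note": f"learned via BGP from AS{asn}",
        })
    return flows

for flow in compile_flows(announcements):
    print(flow)
```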
A very interesting talk was given on Network Functions Virtualization (NFV). This work is not coming out of the SDNRG, but from a fairly new organization called the Industry Specifications Group (ISG), which is focusing on NFV. The ISG members are all large service providers. This particular talk was about virtualizing Broadband Remote Access Server (BRAS) and Content Distribution Network (CDN) functions onto an x86 platform. It was basically an "acid test" to validate that this can be done, and the performance was shown to be pretty good! It didn't have all the features and functions typically found in a BRAS, but it had enough functionality to validate the network functions virtualization capability.
The FORCES WG is also worth mentioning as it relates to SDN and, more importantly, OpenFlow. A presentation was made explaining how some of the problems the OpenFlow movement is experiencing and working to solve have already been solved in various FORCES activities and implementations. So it was recommended that those folks who are developing OpenFlow solutions leverage the work and experience from the FORCES community.
So, that wraps up this short update on IETF 86 activities that are related to the SDN solution space. I hope you found it useful. And don’t forget that the ONS event is coming in April!
Onward to the SDN-enabled cloud …
Warning! "Peak summit season" is upon us!
Sure, "peak" and "summit" are synonyms, but I didn't mean to be redundant. Last week I was at the SDN Summit in Paris (with side trips to Italy and Germany), next week I'll be speaking / chairing / organizing at the Ethernet Technology Summit, and then in mid-April I'll be attending the Open Networking Summit. For variety, I'll finish at the ONF Member Workday(s) right after ONS. Besides a surfeit of "conference chicken" and a dearth of business cards, what will be the upshot? Well, I'm seeking a richer sense of the networking market zeitgeist, and Paris did not disappoint in that regard. There was a good range of talks from academics to actual-doers, as well as from numerous vendors (including one by Brocade's Dave Meyer). One that particularly amused me was by Kireeti Kompella, of recent notoriety for his Juniper-Contrail-Juniper travels. His talk was titled "SDN: OS or Compiler", and he advocated an altered viewpoint on SDN: we see SDN as an operating system, but we should think of it as a compiler. We should tell the system "what we want," but instead we often prescribe "how it should do it."
While I may not always agree with Kireeti, he's right on the "what, not how" point of view. My first point of amusement derives from the fact that, entre nous, this what-not-how approach was central to the grassroots movement within the ONF in October of 2011 which led to the creation of the Forwarding Abstractions Working Group.
A second point of amusement is that the compiler question is partly a bit of prestidigitation, suggesting that the problems are now gone when in fact we’ve merely tucked them up our sleeve. Suppose we agree on this compiler notion. The next round of questions is, what is the language we will use for describing our SDN needs? For the real issue is, what are the abstractions that we’ll be dealing with? And at some level that’s another sleight of hand, because then we will need to define what we mean when we say “connection” and “endpoint.” (E.g. if tunnel endpoints exist in a device, does that mean the device is an endpoint?) And once we have bashed out some language, then we’ll have arguments about where the compiler resides. Because it’s the compiler’s job to translate the “what” into the “how”. At which layer does this translation occur? Is there only a single layer of translation? Etc.
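As a toy illustration of my own (not anything FAWG or Kireeti presented), here is what a tiny "compiler" from a declarative "what" to a device-level "how" might look like; every name and structure in it is hypothetical.

```python
# "What": a declarative statement of intent.
intent = {"connect": ("host-A", "host-B"), "min_bandwidth_mbps": 500}

# A trivial topology the compiler knows about: each adjacent pair of nodes
# maps to the egress port used to reach the next hop.
topology = {("host-A", "sw1"): 1, ("sw1", "sw2"): 12, ("sw2", "host-B"): 3}

def compile_intent(intent, topology):
    """Translate the 'what' (an end-to-end connection) into the 'how'
    (per-device forwarding instructions along a computed path)."""
    src, dst = intent["connect"]
    path = [src, "sw1", "sw2", dst]  # stand-in for a real path computation
    return [f"on {a}: forward traffic for {dst} out port {topology[(a, b)]}"
            for a, b in zip(path, path[1:])]

for step in compile_intent(intent, topology):
    print(step)
```

Even a toy like this immediately surfaces the questions above: what the intent language is, what "connection" and "endpoint" mean, and where the translation actually runs.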
Don’t get me wrong; these are important questions. And I’m actually glad that Kireeti used his 25 minutes to articulate the what-not-how notion, prompting some good discussion. And yet, we should not pause in our development of SDN to iron out these questions, because pausing won’t improve the outcome. Better to build on the SDN momentum by delivering incrementally and iteratively better solutions over time. In a way, this topic sets the stage for my talk at ETS next week, which I’ll give a mini-preview of here.
This "compiler" discussion is relevant to OpenFlow in part because many consider OpenFlow to be "the x86 instruction set for the network." What a great idea!! Why didn't we think of this before! Well, actually, attractive as it is, unfortunately we didn't "think of it before," since it turns out that our networks have no x86-equivalent platform. Often when this minor detail comes up, voices (basso profondo [1], typically) are raised that "we'd be better off with common platforms, like the compute industry has!" Possibly... But strident appeals merely distract from the reality that our deployed network silicon has a myriad of "instruction sets". So. Here we are. We've created an instruction set, and it doesn't quite match our "network computers", the devices that constitute our networks. What do we do?
Well, if we had the ideal alternative in place, I would say that we should flush everything and switch to this new ideal. But we don’t yet have a better answer. Or, more accurately, there’s nothing close to a consensus on what the better answer is. And so I recommend that we move forward with what we have and improve as we go. Is it perfect? No. Is it better than sitting and complaining? Yes. Not only that, there are historical precedents. RISC architectures were arguably better (more efficient, whatever) than x86, but x86 has been a heck of a workhorse and served as a solid platform. Through beaucoup tweaks and enhancements, IPv4 has far outlived many predictions of its demise until address depletion is finally forcing the adoption of IPv6 (which, nota bene, had its own detractors, of course). So the idea that we might make do with the “OpenFlow x86 instruction set” for the time being might just work out, in the same way that we just managed to squeak by with that instruction set in the compute world.
What happens when we “compile” our network desires into an instruction set that’s out of sync with our underlying platforms? Well, what happens is that those instructions become the new “high level language”, and we need new compilers, or perhaps interpreters (easily built on servers or NPUs), to execute the requested functions. As it happens, FAWG (my ONF working group) is also addressing how to map OpenFlow to ASICs. My talk at ETS next week, entitled “Bringing OpenFlow’s Power to Real Networks” visits how we can deal with this.
My apologies to all you CS majors (those who reached nostalgically for your copy of the dragon book on compilers) for merely tickling the surface of compilers in this post. I’ll happily engage with y’all at the expert table next Wednesday night. But if you come, you’ll be the compiler expert; as a Physics major, I stalled out in Chapter 2.
Image credit: Lance Burton, via www.magician.org
Note 1: Is it me, or is it really odd that there is no Italian list?
Stu Elby, the Chief Technologist at Verizon, has been evangelizing how SDN will help create service-aware networks, and how a use case such as service chaining can be one of the early applications for SDN within the WAN. There is a video on YouTube http://www.youtube.com/watch?v=M15h0ik9i7g where he and Joe Constantine, the CTO of Sales for the Americas at Ericsson, are interviewed by Martin Taylor, the CTO of Metaswitch. They both agree that this use case has real merit and many benefits. Stu explains how Verizon today offers a large range of services that are consumed by the mass market, all of which currently run over the IP network. As the network operates today, it is very costly because these services are stitched together at the IP layer in static fashion and go through many appliances, firewalls, screening devices, and other value-added devices, whether the customer signed up for them or not, since the network is architected via source and destination routing.
He sees SDN as a mechanism to provide service-aware routing, which includes traffic steering and global load balancing, and which will factor in not only the information in the packet but also the user's subscription and what type of service is coming over the network (e.g., a YouTube click or a SIP call). All of this information will be taken into account and packets will be moved through the network more efficiently, as the intelligence will be there to know how to connect through certain appliances and skip others based on policies and QoS. Essentially, this entails tying the information about the individual subscriber accessing the service, like user profiles and network policies, together with the static information that already comes through the network in the packet header.
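A rough Python sketch of the kind of policy lookup being described here: the subscriber profile plus the traffic type select which appliances a flow is steered through, instead of forcing every packet through every appliance. The tiers, traffic types, and chains are invented for illustration.

```python
# Hypothetical per-subscriber service chains: which appliances a flow visits.
service_chains = {
    ("premium", "video"): ["cdn-cache"],
    ("premium", "voip"):  ["session-border-controller"],
    ("premium", "any"):   ["dpi"],
    ("basic",   "video"): ["firewall", "parental-filter", "cdn-cache"],
    ("basic",   "any"):   ["firewall", "screening", "dpi"],
}

def select_chain(subscriber_tier, traffic_type):
    """Pick the service chain for a flow from subscription and service type."""
    return (service_chains.get((subscriber_tier, traffic_type))
            or service_chains[(subscriber_tier, "any")])

print(select_chain("premium", "video"))  # ['cdn-cache']
print(select_chain("basic", "web"))      # ['firewall', 'screening', 'dpi']
```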
They both feel that there are some key benefits, the first of which is adding value to the customer experience, along with equipment savings and CPU savings. Stu further goes on to say that this service can be run as an overlay to an existing IP network using the SDN framework, with or without OpenFlow -- however, he sees OpenFlow providing additional efficiency as a software-enabled architecture.
These thoughts fit very well into the Brocade vision of using SDN and OpenFlow as an overlay network, and this example should be highlighted as well as the benefits that it will provide within the WAN.
I spend most of my work time with my head down (figuratively, anyway) focused on SDN technology details and working with many others who are savvy about frame layouts, standard protocols and configuration management. Then in my personal life, when my neighbors ask about work, I exuberantly ramble that SDN has made networking exciting again. They smile and nod, then comment on the weather. Apparently networking is not something they spend much time thinking about. I don’t often get to hear from folks who actually think a fair bit about networking in terms of a service or utility rather than at the level of bits-and-bytes. So I was thrilled recently to be able to hear new perspectives.
A few weeks ago, I attended the US-Ignite “Developer’s Workshop” in San Leandro, which was all about identifying and building cool apps to take advantage of “unconstrained” network bandwidth. Two weeks ago I attended the 2013 TED conference in Long Beach. Both events offered opportunities to hear what the larger world wants from networks.
The US-Ignite event was a forward-looking “BarCamp” style event with attendees ranging from application developers to researchers to vendors. Much of the focus was related to finding and highlighting applications that clearly demonstrate the value of gigabit bandwidth. Google Fiber was a recurring theme, and indeed two Code for America developers were in attendance mere hours before their flights to Kansas City. Distance learning and telemedicine and other cool applications were discussed. Interestingly, SDN and OpenFlow (usually treated as synonyms) were routinely mentioned in a kind of offhand way whenever some new complex functionality was anticipated or desired. As in, “Of course we’re all going to be videoconferencing constantly from our desktops and mobile devices, and OpenFlow will ensure we get the quality of service we need.”
I was pleased that application developers fully expected to use a network-aware API to express their needs to the network, but then surprised that they saw the interface as pretty much unidirectional. That is, the application says “I need this much bandwidth now”, and (if the app is really nice) maybe “I need the bandwidth for XYZ minutes” or “for QRS gigabytes.” The expected network’s response seemed to be limited to “Okay, gotcha” or, at worst, “Sorry, missed that… can you repeat please?” I asked what the network should say when it’s out of resources (as when each house in the neighborhood is watching 2 separate HDTV movies at the same time). Can the network say “Gosh, I’m sorry, but I’ve only got half of that right now, but should have it at 8:30pm, which would you prefer?” That comment drew frowns and responses like “next gen networks should solve those problems for us.” As if ubiquitous “unconstrained” end-to-end bandwidth is just a year or two away, even though we all know that provisioning more bandwidth prompts deployment of new bandwidth hogs. Will Barkis of Mozilla, with his neuroscience background, piped up that we will quickly exceed human sensory bandwidth. But look out! Here comes the Internet of Things to consume a lot more!
Here’s where it gets interesting. Someone chimes in “Why isn’t this like electricity? My utility company makes it easy to use more power!” Ah, good point. But when we use electricity we pay per kilowatt hour and even varying rates by time of day. If we paid our ISP that way, would we insist on guaranteed high bandwidth or bandwidth scheduling? Could the ISP deliver that? When will apps be smart enough to talk *with* networks and not just *over* them? All this highlights our chicken-and-egg situation. Providers would probably be happy to charge for premium services, but how can they charge for them when most connections extend beyond any individual provider’s control? How can scheduling work when apps don’t ask for what they need? How can apps ask for what they need when there’s no standard “network orchestration” service defined? How will providers and vendors develop that standardized service when demand is fuzzy and interoperability is required before the first dollar will be spent? Although I’ve had technical conversations with colleagues about many of these issues, the discussions usually miss the system level economic or deployment challenges. The workshop was very effective at getting each of the attendees thinking outside of our familiar brain boxes.
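To make the "talk with, not just over" point concrete, here is a purely hypothetical sketch of what a bandwidth request with a counter-offer could look like; no such standard network orchestration service exists today, which is exactly the chicken-and-egg problem above.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    granted_mbps: int     # what the network can commit to right now
    full_amount_at: str   # when the full request could be satisfied

def request_bandwidth(requested_mbps, available_mbps, next_window="20:30"):
    """A network-side answer richer than 'okay, gotcha' or 'sorry'."""
    if requested_mbps <= available_mbps:
        return Offer(requested_mbps, "now")
    # Counter-offer: here is what we can do now, and when the rest is available.
    return Offer(available_mbps, next_window)

print(request_bandwidth(100, available_mbps=50))
# Offer(granted_mbps=50, full_amount_at='20:30')
```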
In contrast to US-Ignite, this year's TED Conference was not networking focused. But, uncharacteristically, it did have more than one Internet-focused presentation. Danny Hillis spoke on the vulnerability of the Internet to failure caused by either a cyber attack or inadvertent mistakes. Given our world's growing dependence on the Internet, he asserted that we need a "Plan B" network that critical services (hospitals, first responders, etc.) could rely on in the event of an Internet outage. When asked, he said that the cost of Plan B wasn't cheap, but worth it, perhaps a few hundred million dollars. He also said that "Of all the problems at this conference, this is probably the very easiest to fix." (Well, maybe… or maybe not.)
Vint Cerf, recognized Internet ancestor, who participated in a session on interspecies Internet, got a chance to comment on Hillis’ Plan B idea. He essentially dismissed the concern as “unlikely,” saying that the Internet had been around for more than 30 years without much trouble. This came just minutes after Cerf had mentioned that IPv6 had just launched in the past year.
My own sense is that both speakers, in compressing their comments for time, and simplifying them for a lay audience, glossed over a lot. When Hillis suggested that Plan B would need all-new protocols, completely different from existing IP stacks, he skirted past the fact that dependency on IP as a protocol is ubiquitous. A dozen implementation challenges popped up in my mind, and I quickly imagined that Hillis’ cost estimates might be low by a factor of ten or more (though perhaps some clever approach might mitigate this). But in contrast to Cerf, I felt that Hillis was right about our risky (and growing) dependence. A major Internet outage might be viewed as similar to a meteor impact or mutant supervirus pandemic: unlikely but potentially catastrophic and worthy of consideration.
It was clearly not the point of US-Ignite or TED to sort out all the issues that were raised. Instead, these events served to raise awareness and catalyze innovation and progress. I personally valued the big picture perspectives that I got from TED and US-Ignite, and the way that they reminded me that as we lay bricks we must not lose sight of the eventual cathedral.
(Image credits: TED session photo by James Duncan Davidson/TED; Sagrada Familia Cathedral by Bernard Gagnon via wikimedia.org)
New software networking products are coming to market at an increasingly rapid pace. While Vyatta was first (by years), a slew of virtual and SDN-related solutions have been announced over the past few quarters, including within the category of virtual router. It’s easy to get confused over how all these products impact the march to SDN architectures.
For example, there is a bevy of security virtual machines in market now. Research firm Infonetics estimates the market for VMs that perform firewall, VPN, IDS/IPS and content security is roughly $1B currently.
However, these are Layer 4-7 products, delivering functions that make the network better. The base configurable network is Layer 2 and 3. That's why in the traditional hardware networking market the L2-3 space dwarfs L4-7 by roughly 5:1. So in software terms, the march to SDN needs to focus on the L2-3 space, which in software is defined by the virtual router.
Today the vast majority of virtual routers are deployed in Virtual Data Centers (VDC) and Cloud Service Providers (CSP). This is highly correlated with the fact that these are the most highly virtualized production environments; and when there is application VM density there is the need to manage the traffic between those VMs. That’s the job performed by switches and routers – or in software terms, vSwitches and vRouters.
As every hypervisor already has a vSwitch, the first actionable step for customers in driving to a SDN architecture is selecting and implementing vRouters. What is the difference between the vendor options? And how do those differences relate to the virtual environment requirements? The wrong choice can severely impact your IT strategy by throttling performance, limiting agility and even limiting the kinds of compute environments that may be desired.
Performance is enabled by the vendor, through both know-how and desire. It takes time for a vendor to work through the different challenges that stand in the way of router performance in a virtual world. A recent NetworkWorld article highlights the substantial differences that can exist between virtual routers.
If you are evaluating a virtual router from a vendor that sells hardware routers, beware their lack of desire… the faster their virtual router goes the more they cannibalize their expensive hardware routers. Their only option is to make the virtual router go faster, but charge an outrageous amount for it in order to cover their lost hardware revenue. (For more on this, see guest blog on SDNCentral.com titled Software Centricity and the New Normal.)
Obviously, artificial performance throttles put in place by the vendor also impact your infrastructure’s agility. Buying with headroom to grow is a big part of that.
Agility and choice are also reduced if the vendor limits the deployment environments. For example if the vRouter is tied to a specific vSwitch, that can limit your hypervisor choices – and even your entire cloud computing architecture. If you are building a hybrid cloud (your own VDC with linkages into a CSP such as Amazon) you need your vRouters to be deployable in both environments.
So in the march to SDN, keep these reminders in place:
In a virtual world, there are new things to consider when making vendor choices. Keep your options - in a word - "open."
Service-driven organizations are compelled to make the leap into customer-driven organizations by building a highly scalable and available data center while providing guaranteed service delivery on demand. The openness and programmability of the network, the automation of management, and the flexibility of virtualized infrastructure are the essential requirements for proactively engaging at a moment's notice to accommodate the increased complexity of applications and user demands.
The on-demand nature of the data center requires the ability to act in either a proactive or reactive manner based on the nature of the workload, and then to distribute resources seamlessly within the existing infrastructure. To better characterize these requirements, we will discuss the two most practical solutions: dynamic resource provisioning and distributed resource management.
Dynamic provisioning, the ability to automatically spin up new instances of application resources as workload conditions demand, is a key requirement for fully realizing the benefits of a highly virtualized data center. Ideally, the goal is not only to provision and de-provision virtual compute resources or simply move applications and data around, but to actively monitor and direct traffic while dynamically managing network resources. To accommodate an elastic environment such as this, the supporting network services within it must also adapt. When the network tier can't change as rapidly as the resources behind it, the data center becomes more vulnerable to critical failures and unable to meet demand. To better control this highly dynamic environment, you need to know how applications are performing, how applications are being delivered, and how traffic is being controlled and directed to the available resources.
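As a sketch of what that control loop might look like (the metric names, thresholds, and actions below are placeholders, not any particular product's API):

```python
def reconcile(metrics, instances, max_response_ms=200, min_instances=2):
    """Decide whether to add or remove application instances based on
    delivery metrics, and keep the network tier in step with the change."""
    actions = []
    if metrics["response_time_ms"] > max_response_ms:
        actions.append("provision one more application instance")
    elif metrics["response_time_ms"] < max_response_ms / 2 and instances > min_instances:
        actions.append("drain and de-provision one idle instance")
    # The network tier must change in the same step, so traffic is only
    # directed to instances that actually exist.
    actions.append("update load-balancer pool membership")
    return actions

print(reconcile({"response_time_ms": 350}, instances=3))
```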
Distributed resource management utilizes a specific calculation model to determine whether a virtual resource cluster is balanced. This functionality serves the needs of the virtualized data center, where workload is shared across an even distribution of hosts in the cluster. If one of the VMs requires more resources than the average of the other hosts, the system will restore balance by redistributing the load across the cluster. Inevitably, application administrators need the combination of application performance data and the metrics used to calculate underutilization or overutilization of resources in order to tune the environment for optimal results. Without understanding how each service is utilizing the metrics, what metrics are used, and what actions to take, any administrator will have a tough time scaling out the management and distribution of all the different sets of application resources.
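As one deliberately simple example of such a calculation model (real schedulers weigh many more metrics), a cluster can be flagged as unbalanced whenever a host deviates from the mean utilization by more than a tolerance:

```python
def rebalance_plan(host_loads, tolerance=0.15):
    """host_loads: {host: utilization between 0.0 and 1.0}. Suggest a
    migration whenever a host strays from the cluster mean by > tolerance."""
    mean = sum(host_loads.values()) / len(host_loads)
    hot  = [h for h, u in host_loads.items() if u - mean > tolerance]
    cool = [h for h, u in host_loads.items() if mean - u > tolerance]
    return [f"migrate a VM from {src} to {dst}" for src, dst in zip(hot, cool)]

print(rebalance_plan({"host1": 0.90, "host2": 0.40, "host3": 0.55}))
# ['migrate a VM from host1 to host2']
```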
Trying to manage data center resources and utilize them effectively while preserving application performance in a virtualized environment has been the core objective, and often the challenge, of any virtual infrastructure management software. A VM resource management system alone may not feasibly scale to the number and diversity of hosts and VMs supported by today's modern cloud service providers. What happens when there are interdependencies among the VM hosts, the application, the network, and the management systems, but there is no tightly integrated API call to communicate or share messaging in this heterogeneous virtual environment? Other challenges include communication constraints between applications and underlying systems, lack of integration between the network and virtual machine resources, and limited visibility into user demands and application behaviors. It's like a blind date: nobody knows what to expect, but everyone is eager to make the best possible impression.
What we have to consider is an infrastructure component that is tightly integrated to enable the automation, migration, and scale of applications while increasing visibility across the compute, network and application delivery tiers. It needs to combine the application management intelligence of the application delivery network tier with a scalable, business-level policy engine to automate application resource provisioning and management of infrastructure resources. To characterize this further, the infrastructure component acts as a broker between the application delivery network and the underlying application resources, which simplifies the on-demand provisioning of application and network resources within a virtualized data center. This application resource broker ensures optimal application performance by dynamically adding and removing application resources as demand requires. The broker works in tandem with the application delivery functions to provide these capabilities through real-time changes in traffic demand, application response time, traffic load, and infrastructure capacity from both compute and network infrastructure. As demand reaches the configured threshold for an application, the system will initiate provisioning actions to ensure that necessary and appropriate application resources are available to meet the defined Service Level Agreements (SLAs). Together, this holistic approach will increase data center availability and drive service innovation through automated, self-service provisioning models that quickly adapt to changing conditions based on infrastructure performance, application need, and user demand.
There are many core cloud use cases where this functionality can be of great significance, especially enabling cloud bursting for a hybrid cloud service, or enabling business continuity across globally distributed data centers by automating application resource mobility in the event of a disaster. In the business continuity use case, the broker can enable a seamless redirection of both new and active users when VM migration occurs between data centers, thus avoiding the risk of a single data center failure. Because the broker is tightly integrated with the VM resource management, it can automatically detect VM movement across sites and ensure an undisrupted end-user experience by redirecting client sessions to the right VM cluster in a manner that is fully transparent.
In order to accommodate the changing cloud environment and the varied management requirements, the broker too needs to seamlessly integrate with custom and third-party virtual management suites and open orchestration frameworks via a combination of northbound APIs (NBAPI) and standards-based application messaging protocols (such as the Advanced Message Queuing Protocol, AMQP).
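As a small, hedged example of the messaging piece, a broker could publish provisioning events over AMQP. The sketch below assumes the widely used pika Python client and an AMQP broker listening on localhost; the queue name and event fields are invented.

```python
import json
import pika  # third-party AMQP client; assumes a broker (e.g. RabbitMQ) on localhost

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="resource-events")  # hypothetical queue name

# Hypothetical provisioning event emitted by the application resource broker.
event = {"application": "web-tier", "action": "scale-out", "instances": 1}
channel.basic_publish(exchange="", routing_key="resource-events",
                      body=json.dumps(event))
connection.close()
```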
At minimum, the application resource broker needs to have these five critical elements to fully support your dynamic service-driven data centers and facilitate the increased complexity of cloud based infrastructure:
Please post your comments about what challenges you face in achieving a self-service, on-demand deployment model in the data center and the significance of having an infrastructure component or an application broker to aid your objectives.
For the next topic, I will discuss cloud bursting as the enabler for hybrid cloud service and the components that aid in achieving the ideal resource orchestration and facilitate extensible programming or customization.
I just returned from the fantastic NANOG57 (North American Network Operators' Group) conference in Orlando last week, where we had an excellent combination of tutorials and talks on current topics for network operators. We had some BoFs, panel discussions and even a few presentations on SDN at past NANOGs, so for this conference we wanted more focus on the topic. We invited the team from Indiana University (IU) GlobalNOC to come and talk about what they are doing with Internet2’s Open Science, Scholarship and Services Exchange (OS3E) initiative.
Monday morning started off with a comprehensive 1½ hour tutorial on OpenFlow that explained SDN concepts and the OpenFlow protocol in detail. The room was packed and it was one of the best-attended tutorials we’ve ever had at the conference, especially considering it started at 9:00 on a Monday. What I really liked about this tutorial was that it was highly interactive, and used lots of hands-on exercises to demonstrate key concepts in a series of steps that built on each other. First, basic OpenFlow controller operation was demonstrated, then attendees added port-based and IP-based rules to forward traffic, and verified that the rules were working.
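For readers who missed the tutorial, the rules the attendees built were conceptually along these lines; this is a generic Python illustration of match-plus-action flow entries, not the tutorial's actual controller code.

```python
# Conceptual flow entries of the kind built in the tutorial: a match on
# header fields plus an action, evaluated in priority order.
flow_rules = [
    # Port-based rule: anything arriving on port 1 is forwarded out port 2.
    {"priority": 10, "match": {"in_port": 1}, "action": "output:2"},
    # IP-based rule: traffic to 10.0.0.5 is forwarded out port 3.
    {"priority": 20, "match": {"ipv4_dst": "10.0.0.5"}, "action": "output:3"},
]

def forward(packet):
    """Return the action of the highest-priority matching rule."""
    for rule in sorted(flow_rules, key=lambda r: -r["priority"]):
        if all(packet.get(k) == v for k, v in rule["match"].items()):
            return rule["action"]
    return "send to controller"  # table miss

print(forward({"in_port": 1, "ipv4_dst": "10.0.0.7"}))  # output:2
print(forward({"in_port": 4, "ipv4_dst": "10.0.0.5"}))  # output:3
```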
After the tutorial, the NOGLab opened, where IU had set up their OE-SS demo (NOGLab information starts at slide 30). The demo used a rack-sized network to show in concept what Internet2 is using in production to provision Layer 2 circuits over their national 100 GbE backbone. A multivendor network with Brocade, Dell, IBM and NEC switches was connected together to form two separate backbone networks, and OE-SS was used to control the provisioning via OpenFlow. Two demo stations were available to let people use point-and-click provisioning to build circuits over the backbone. We had great attendance throughout the demo, which was open during the scheduled breaks in the conference agenda. There was a steady flow of people during all three days who lingered and talked with the IU staff about OpenFlow and their experience with SDN.
My takeaway is that the SDN content at NANOG57 was received with great interest and was a very successful addition to the program. Network operators were given the opportunity to not only learn about OpenFlow, but also to see it running in the demo and to play around with it. I’d especially like to thank Ed Bales, AJ Ragusa, Chris Small, and Steve Wallace for giving us such a great tutorial, an awesome NOGLab demo, and for all the valuable discussion they had with people at the conference. As we start to plan the program for the next NANOG, I expect that we’ll continue to add more content on SDN and OpenFlow.