Back in September, we announced the Brocade Vyatta Controller, a commercial package of the OpenDaylight controller. We did an initial release in November and since that time have been working with our early adopters to understand and respond to the full range of their needs.
The single biggest challenge for any organization looking at SDN is simply getting started, non-disruptively. That means not having to spring for a lot of new equipment, but it also means supporting the organization through a ramp-up in skillsets and a shift in processes. So we’ve spent a fair amount of time developing ancillary education and support services in addition to making it easy to acquire the controller in the first place.
In our industry, there are two schools of thought: those who believe the future of networking lies in openness, and those who believe a proprietary approach is indispensable.
I’ve spent the last few months working closely with the OpenDaylight and OpenStack developer teams here at Brocade and I’ve gained a heightened appreciation for how hard it is to turn a giant pile of source code from an open source project into something that customers can deploy and rely on.
The SDN controller of today can be likened to Clark Kent, who plods along doing one thing well: print journalism. The SDN controller that we need is more like Superman: seeing, hearing and protecting everyone and everything. And once we see the SDN controller donning the garb of Superman, we will never go back to Clark Kent, will we? Read on…
Why technical philosophy matters, and how it ultimately shapes what gets delivered to users.
Join Brocade and Intel as they walk you through the NFV journey, highlighting the what, the how, and why this innovative and disruptive focus is critical to the next step of innovation in your business and operations.
This webinar will walk you through example use cases leveraging Intel x86 solutions, Intel’s SAA, Open Orchestration platforms and next generation NFV products like the Brocade Vyatta vRouter.
WHEN: Aug 26, 2014, 10:00 am PDT
Introducing the Brocade Terminology Guide: a guide that will aim to help people better understand, organize, and absorb a portion of the many terms around campus by use of strong visuals. Every week, new posts will be available on Brocade Communities and will be focused around a certain theme. Be sure to also join the conversation on social media via #BRCDology.
Since this week marks the launch of the Guide, it will be all about going Back to Basics. The terms “SDN” and “NFV” hum in a constant stream around here and throughout the networking industry, as if someone were whispering them through the walls. So, what really is the buzz about SDN and NFV? Read more to find out!
The OpenStack Summit that took place in Atlanta two weeks ago is the first I’ve attended myself, so I gave myself some time to absorb my own experience and to read the subsequent press coverage before adding to the noise.
As I boarded my plane home, I tweeted: “Top 3 Q's I got asked at #openstack: 1) Roles of ODL vs Neutron? 2) How to make money in open source? 3) When will OpenStack be usable?” I’ll address #1 and #3 below…#2, of course, remains the multibillion-cowry shell question.
What should be the respective roles of ODL and OpenStack Neutron?
At the 10,000 foot level, it can sound like OpenStack Neutron and SDN controllers do more or less the same thing—provision and manage networks for cloud consumption. This has led to some discussion as to what should be done by which entity. Before jumping into the theorizing, though, it’s good to have an understanding of the current state of Neutron, the networking module for OpenStack. Rivka Little of SearchSDN provides a good summary, starting in paragraph 3 of this article. Most critically:
“Inside the cloud, Neutron works with virtual switches and hypervisors to configure ports and devices, and provision virtual overlays and tenants. But a Layer 3 agent is responsible for connecting these tenants out into the data center and the Internet. All traffic in a single cloud environment runs through that same L3 agent, which creates a choke point. That Layer 3 agent also lacks dynamic routing.”
That’s not to say, however, that OpenDaylight or other controllers are replacements for Neutron. Neutron is simply the newest portion of OpenStack and is evolving rapidly. Meanwhile, a number of the developers working on Neutron are also working on OpenDaylight, especially on integration between the two projects. Rivka Little’s companion article, Do OpenDaylight and OpenStack Compete or Complement?, provides a good picture of the real focus of the discussion, i.e., the appropriate degree of abstraction for Neutron.
Using OpenDaylight with OpenStack (slides and video with demo) gives a generic overview of how the two work together. For a more concrete picture, this demo shows how to instantiate and orchestrate some Vyatta vRouters with OpenStack and OpenDaylight.
Supporting Multi-Tenancy Between Data Centers
The SearchSDN quote above highlights a key use case for OpenStack: maintaining tenant connectivity and policy in clouds constructed from multiple physical data centers. Brocade has been working with Huawei on a proposal to do just that, which was shared at the Design Summit portion of the Atlanta event. You can read the blueprint here.
One of the blueprint authors, Mohammed Hanif, talks about it in this video:
This brings us to the “usability” question: not When, but By Whom, and For What?
First, some data points: I’m told there were virtually no actual user organizations at the summit a year ago. I don’t know the actual attendance breakdown in Atlanta, but anecdotally, roughly ¼ of the badges I had time to eyeball there were user orgs. Although the headliners (Walt Disney, Wells Fargo) are very large companies, most were not—something borne out in OpenStack’s own User Survey, which shows that 60% of deployments are in companies of under 500 employees. At the opposite end of the spectrum, most of the large, non-vendor organizations present were not large enterprises, but telcos. Which brings me to the point of this section.
Last fall, Geoff Arnold wrote a much-discussed post called Whither OpenStack, in which he debated whether it made much sense for enterprises to try to use OpenStack for private clouds. He also made reference to telcos having some specific needs of their own.
Consensus now seems to be emerging that there’s room for enterprises to develop and grow private clouds on OpenStack over time, especially with the certifications and hardened distributions that are now more firmly in place.
At the same time, NFV has seen remarkably rapid adoption in the telco space, and there’s clearly an appetite for using OpenStack to orchestrate those functions. Some additional work (slides) needs to be done within OpenStack to really support that, but we may well see a telco-centric body of work emerging over the next few months, with NFV at the core.
Late-breaking: a scrap over Fibre Channel in OpenStack
In a post last week, Stephen Foskett questioned whether there’s even a need to include Fibre Channel initiatives within OpenStack. J Metz responded this morning that it’s a moot question, since Fibre Channel is already implemented in OpenStack, with more proposals for the Juno release. Instead, Metz says, “the question is related to the level and extent of which Fibre Channel can be managed and controlled by some type of orchestration layer using OpenStack infrastructure.” It’s a good question, and one that will be answered as users road-test it and OpenStack capabilities evolve.
In the meantime, if you’re curious where Fibre Channel is today in OpenStack, here’s a brief talk given by Andre Beausoleil on the FC Zone Management capabilities implemented in Icehouse.
One final observation
Recent IDC surveys indicate that cloud-service providers and enterprises understand how SDN can bring their network infrastructure into better alignment with their workloads and business objectives.
DDoS attacks are on the rise. That statement by itself might not be that interesting because in this climate of "cyber insecurity", it probably wouldn’t surprise anyone that the number of attacks is increasing. However, what is more interesting, and even more troubling for networks, is that the size of these attacks is on the rise, with some attacks reported at over 400 Gbps. And yet the same solutions that were used for much smaller attacks are being used in an attempt to detect and mitigate these security threats.
Brocade is active in the OpenStack community. This year our team members have collaborated with several partners and customers to offer pragmatic and future-looking talks on a range of topics, from the relationship of OpenStack and the OpenDaylight Project to new techniques in NFV management and Fibre Channel Zone Management. We invite you to vote for those of most interest to you.
In the span of just over one year “Network Functions Virtualization” (NFV) went from a non-existent term to a dominant theme at Mobile World Congress. For a trend to skyrocket from absolute obscurity to being showcased at the world’s largest networking show is a very rare phenomenon.
The importance of Intel to the NFV movement cannot be overstated. At the root of agility and economics is an open ecosystem of software with a common, powerful, highly economical hardware platform to leverage. The hardware must be able to be delivered in a variety of form factors given the wide variety of network infrastructure that exists, but it’s crucial that there be an architectural similarity in order for the software ecosystem to thrive.
Like many areas of information technology, data centers continue to move away from monolithic and closed architectures to increasingly virtualized, dynamic and open environments. Understanding the evolving needs of our customers is why Brocade acquired virtual routing pioneer Vyatta in 2012.
Next week, arguably one of the biggest SDN events will be held in Denver, Colorado. At this event there will be more live demonstrations of SDN applications and solutions by organizations actually using SDN than there have been at any other event to date. A vibrant community will come together for nearly a week to openly share technology and ideas with one another for the betterment of each organization... scratch that, mankind. If you have an interest in where the world of networking is going, then you should pay attention to what is happening at Supercomputing 2013 (SC13). For the last 25 years, the Supercomputing Conference has been the premier event for the High-Performance Computing community, attracting members of research and education networks, universities, national laboratories, and other public and private research institutions. It is at this event, and really this event only, that you can get a preview of what networking may look like in 2, 3, maybe even 10 years down the road.
These are exciting days in my Forwarding Abstractions Working Group (FAWG); the past 15 months of work are just now producing results that will be instrumental in enabling our SDN future. Past is meeting future in my extended present, and each week brings us closer to fruition. Some weeks ago, FAWG produced a “preliminary final” document about our framework enhancements for the OpenFlow ecosystem.
The true value of SDN will come from the real-world applications that it will enable. Although at times, when listening to vendors describe their SDN strategies, it seems like we are hearing more about solutions in search of a problem, and not the other way around. Today I am happy to be part of an announcement that is not that.
For service providers, the onslaught of data on their networks has made them re-examine, and re-examine again, how to build a more efficient network. The goal is to have a dynamic network infrastructure at all layers, not just to handle the massive amount of traffic but also to unlock new revenue streams by providing premium services with stronger SLAs to customers.
Today, Brocade is excited to be part of an announcement of a successful demonstration of such an SDN application. In collaboration with Infinera and ESnet, this demonstration shows how SDN can be used to provision services and automatically optimize network resources across a multi-layer network as traffic and service demands change. By leveraging an application developed by ESnet, called OSCARS, we were able to show two use cases:
Originally a research project by ESnet, OSCARS has helped scientists collaborate with one another from around the world by moving massive amounts (in petabytes) of mission-critical data generated from research and experiments. Now with the demonstration announced today, this provisioning can be done at both the routing and transport layer using SDN. You can see the details of the demo setup in the diagram below.
This demonstration also exemplifies the power of an open SDN ecosystem. One of the major tenets of SDN is to unlock new levels of innovation for network operators. Establishing APIs between layers, software components and best-of-breed network elements can ignite that innovation, enabling the development of customized applications.
This week the demo will be on display at SDN & OpenFlow World Congress in Bad Homburg, Germany. If you are there, please stop by the Infinera booth #39 to see it live. Also check out the Brocade booth #41 where you can learn more about this solution in addition to other Brocade SDN and OpenFlow solutions. If you aren’t lucky enough to be in Germany, we will be showcasing the demo during a DemoFriday hosted by SDNCentral at 9am on November 22, 2013. Stay tuned for more details on this live virtual event where you will have a chance to hear commentary from Brocade, Infinera, and ESnet, see a live demonstration, and have time for questions and answers.
Yesterday, Brocade announced at the annual Analyst and Technology Day that we have shipped more than a million OpenFlow-enabled router ports to date. So, what does that mean in the evolving world of SDN?
Have you ever had a situation when you came across a feature on a gadget/appliance you already own, that suddenly improves your productivity? Discovering OpenFlow on an MLX port is kind of similar! And the capital cost you incur to evaluate OpenFlow for a specific use case on an MLX infrastructure? Zero! Because every MLX port shipped to date can be upgraded with software to support OpenFlow.
As the market moves from the initial enthusiasm of OpenFlow to more practical deployments, it is logical to ask a few pertinent questions:
If you have MLX deployed in your network infrastructure today, try out OpenFlow. There is no cost to creating a slice of your infrastructure for OpenFlow and having it concurrently coexist with your existing network protocols/present mode of operation. You may discover a more programmatic way to solve problems such as service chaining, traffic steering or others.
Last October the world’s largest Telco Service Providers collectively published the now-famous paper titled, “Network Functions Virtualisation,” which is serving as a call-to-action for the industry. Less than a year later, Brocade announces our next-generation Vyatta vRouter, the 5600.
The industry-leading Vyatta 5400 vRouter has been in production for years and is deployed worldwide. It’s a fantastic solution for multitenant workloads and is deployed in some of the largest clouds, including Amazon, Rackspace and SoftLayer. But the new Telco-driven NFV demand is different. It requires a new level of performance from a virtual router.
This is a critical business issue. The NFV movement is in pursuit of a tremendous boost in network agility in order to enable Telcos to stay competitive. They need an order of magnitude improvement in their time-to-market and adaptation of service offerings to rapidly changing demand dynamics. They also need to get their infrastructure down to an entirely new cost model.
To get there, Telcos will begin deploying substantial parts of their network infrastructure on industry standard x86 servers. If you haven’t looked closely lately, the servers that Telcos will be using are the most network-centric the world has ever seen. The NFV business case assumes that software can take advantage of this modern hardware.
This is the business value of the new Brocade Vyatta 5600 vRouter. Re-architected specifically to leverage Intel’s latest and greatest, the 5600 is the world’s first virtual router for NFV workloads. With speeds that are a full 10x faster than our popular 5400 model, the 5600 can unleash the power of incredibly cost-effective servers and deliver customers a solution that enables them to meet their strategic goals of radically higher agility and lower cost.
It’s not a coincidence that the 5600 is following closely on the heels of the NFV movement; it is proof that Brocade is listening to customers and is aggressively delivering solutions to meet their rapidly changing needs. As Vyatta we invented the virtual router category; as Brocade we're rapidly taking it to new heights.
Today, Brocade unveiled its newest product targeting the Network Functions Virtualization (NFV) movement with the introduction of the Brocade Vyatta vRouter 5600 family. Featuring advanced routing capabilities including BGP and OSPF, the Brocade Vyatta vRouter 5600 is the world’s fastest virtual router with performance up to 10 times that of the industry-leading Brocade Vyatta vRouter 5400 series and is more than 40 times faster than competing products.
With the unique ability to deliver hardware-like performance as a software appliance, the Brocade Vyatta vRouter 5600 enables telecommunication and large service providers to significantly reduce capital (CapEx) and operating (OpEx) costs throughout key areas of the network, without a reduction in performance.
At the foundation of the Brocade Vyatta vRouter 5600 is the company’s vPlane™ technology, a highly-scalable forwarding plane capable of delivering more than 14 million packets per second per x86 core, the equivalent of 10 Gb/s throughput.
A core component of the “On-Demand Data Center™” strategy from Brocade and a key element of SDN architectures, NFV is a call-to-action from customers to convert and consolidate services traditionally delivered through proprietary, purpose-built hardware into virtual machines (VMs) running on industry standard high-performance servers. First adopted by Cloud Service Providers, including some of the largest such as Amazon, Rackspace and SoftLayer, the Vyatta vRouter is the most widely used NFV element in the world.
The Brocade Vyatta vRouter 5600 is currently in limited availability and open to qualified organizations. General availability is scheduled for the end of 2013.
Based on the 250 Mbps performance claim listed on the Cisco Cloud Services Router (CSR) 1000V data sheet as of September 2013.
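For readers who want to check the "40 times" comparison, here is a minimal sketch of the arithmetic. It assumes the comparison sets the 10 Gb/s per-core figure quoted above against the 250 Mbps data-sheet figure in the footnote; the function and constant names are illustrative only.

```python
# Figures quoted in the announcement, normalised to bits per second.
VROUTER_5600_BPS = 10e9   # "the equivalent of 10 Gb/s throughput" per x86 core
COMPETITOR_BPS = 250e6    # 250 Mbps data-sheet claim cited in the footnote


def speedup(fast_bps: float, slow_bps: float) -> float:
    """Return how many times faster fast_bps is than slow_bps."""
    return fast_bps / slow_bps


ratio = speedup(VROUTER_5600_BPS, COMPETITOR_BPS)
print(f"{ratio:.0f}x")
```

On a single core the ratio works out to 40; the "more than" in the announcement presumably reflects additional cores, which this sketch does not model.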
When we founded Vyatta in 2006 the idea of using an x86 server as network infrastructure was pure folly to just about everyone. I’ll never forget the ridicule of one naysayer who in 2008 was quoted in the press as saying, “I would NEVER put a PC in my network.”
As it turns out, the world’s largest semiconductor vendor had plans that guy didn’t know about. Big, non-obvious plans executed quietly over the years to avoid attracting too much attention… a silicon crocodile easing itself toward the beach…
Intel played from their strength as the industry’s dominant processing platform, slowly absorbing networking as a native workload into that silicon architecture. With each turn of their x86 foundry, servers became better packet-processing machines. Those servers kept shipping in mass volumes, and in a short period of time Intel had blanketed the world with Network-Centric Servers.
[Table not preserved. Surviving column headers: (Approximate Release Date); % of Servers Shipping w/ 10Gb/s NICs (see footnote 3)]
Putting this throughput into context, 14.4 million packets per second at 64-byte size is line-rate 10Gb/s performance. So the faster packet-processing innards quickly red-lined the basic 1Gb/s NIC, making it the new bottleneck. By next year it’s estimated that the average server will be shipping with 10Gb/s NICs.
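The relationship between packet rate and line rate can be sketched as follows. This is a back-of-the-envelope calculation, not a benchmark: it assumes standard Ethernet framing, where every frame on the wire is accompanied by 20 extra bytes (8-byte preamble plus 12-byte inter-frame gap). Under that assumption, 64-byte frames at 10 Gb/s come to roughly 14.88 million packets per second, in the same range as the approximate figure quoted above. The function name is illustrative.

```python
def line_rate_pps(link_bps: float, frame_bytes: int, overhead_bytes: int = 20) -> float:
    """Maximum frames per second on an Ethernet link.

    overhead_bytes covers the 8-byte preamble/SFD and the 12-byte inter-frame
    gap that accompany every frame on the wire but are not part of the frame.
    """
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_per_frame


# Compare the 1 Gb/s and 10 Gb/s NICs discussed in the text at minimum frame size.
for nic_gbps in (1, 10):
    pps = line_rate_pps(nic_gbps * 1e9, frame_bytes=64)
    print(f"{nic_gbps:>2} Gb/s NIC, 64-byte frames: {pps / 1e6:.2f} Mpps")
```

The tenfold gap between the two rows is the bottleneck the post describes: a server able to forward well over a million small packets per second per core saturates a 1 Gb/s NIC almost immediately.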
And there you have it: The Network-Centric Server. They’re here, they’re incredibly low-cost compared to proprietary network boxes and customers have already begun to leverage them aggressively.
The adoption started becoming obvious a couple of years ago and now it’s hit deafening levels. Tier 1 Cloud Providers - such as Rackspace, SoftLayer and Amazon - are delivering Network-as-a-Service powered by software infrastructure on servers. All of the world’s top Telcos have formally stated their demand for NFV, which is in essence networking software running on servers. Enterprises are also starting to make the shift. And just in case you missed it, last week at VMworld, VMware finally made it clear they’re reaching for Cisco’s artificially inflated wallet.
The Network-Centric Server is about more than just slashing 90% of the CapEx out of the system. From the strategic / macro perspective, it ushers in a decomposition of system architectures in the same way that happened to compute in the 1990s. That technology disruption shattered and reconstructed an industry three times larger than the network industry. Now it’s our turn.
The crocodile’s now on the beach… and it will not be denied its meal.
1 Approximate, based on figures from Intel and Vyatta testing
2 Not yet disclosed, but hold onto your hat
3 Source: Dell’Oro
I saw an interesting article today about Internet2 that I wanted to share: http://www.healthcare-informatics.com/article/conn
Research institutes are, by design, on the cutting edge of many fronts, and data generation is no exception. Researchers are producing data in petabyte-levels today, and that number is growing exponentially. While this is great news in many ways (more data paves the way to more results), it’s creating challenges when it comes to actually transporting this data and collaborating with other research institutes. Existing Research and Education Networks (RENs) are not equipped to deal with today’s data requirements, meaning that researchers are forced to discard over 90% of data generated(!).
That’s an unacceptable number, so fortunately RENs are taking action to address these trends. Internet2, an advanced networking consortium for the research and education community, is leading the charge. In the article, Robert Vietzke of Internet2 shares three key points that he sees as necessary for supporting high-data research: 100 GbE, security and specifically addressing big data flows, and SDN. We at Brocade completely agree, which is why we partnered with Internet2 to build out their 100GbE SDN backbone using high-performance Brocade MLXe routers. The MLXe enables zettabyte levels of data transport and increased performance through full wire speed 100GbE and leading 10 GbE, 40 GbE, and 100 GbE density. And with hardware-enabled Hybrid Port OpenFlow, RENs can run OpenFlow at full wire speed up to 100 GbE on the same ports as traditional traffic. That means implementing OpenFlow without affecting performance.
The research space is a fascinating one. Pushing the boundaries to drive innovation is something I definitely support. Keep an eye out for more!
There has been a great deal of discussion of late of the relative merits of overlay approaches to SDN vs methods in which network hardware has a more active role to play. The arguments go back a year or more, and are fundamentally rooted in varying assumptions about the primacy of the hypervisor in modern data center architectures.
First, a bit of history. In August, 2012, several Nicira founders published a paper exploring the role of a physical network fabric in an SDN architecture. The paper observed that OpenFlow, and more broadly SDN, in its then-current instantiation didn’t actually solve some fundamental networking problems: most notably, it didn’t do anything to make network hardware any simpler, nor remove the dependency of the host on behavior in the network core. The proposed solution was effectively ‘smart edge/fast, dumb core’, though there were also two key observations that blunt the problems with that oversimplification.
However, as the term “network virtualization” entered the discussion, the temptation quickly became to discuss overlays as though they are completely analogous to server virtualization. Joe Onisick did an excellent job of unwinding that analogy, and I’d encourage you to read his post in its entirety, as well as the comments. His key point was this:
The act of layering virtual networks over existing infrastructure puts an opaque barrier between the virtual workloads and the operation of the underlying infrastructure. This brings on issues with performance, quality of service (QoS) and network troubleshooting…this limitation is not seen with compute hypervisors which are tightly coupled with the hardware, maintaining visibility at both levels.
In other words, network virtualization *still* does not address the fundamental problem of overly complex, rigid and manual physical infrastructure, nor prevent interdependent (physical-virtual) failure. In fact, it adds complexity in the form of additional layers of networks to manage, even while it simplifies the configuration and deployment of specific network services to specific clients. (In the absence of hardware termination, it is also unavailable as a solution to a broad swath of non-virtualized workloads, a problem which VMware is moving swiftly to address.)
Where does this leave us, then? The earlier articulation of the distinct purposes of the core and edge is important. True fabrics such as VCS explicitly address the need for simpler physical network operations through automation of common routines, as well as a more resilient, low latency and highly available architecture—a fast, simple, highly efficient forwarding mechanism. However, precisely because the physical network needs to be able to operate and evolve independently of what goes on at the software-defined level, fabrics can not be “dumb”. The individual nodes must in fact have sufficient local intelligence as well as environmental awareness to make forwarding decisions both efficiently and automatically. Not all fabrics are architected thus, but VCS fabrics have a shared control plane as well as unusual multipathing capabilities that allow them to function largely independently after initial set-up. There can also be utility in horizontal, fabric-native services that may be different from those deployed at the edge, or which may in some use cases be simpler to deploy natively.
VCS fabrics also maintain visibility to VMs of any flavor, wherever they may reside or move to, as well as mechanisms for maintaining awareness of overlay traffic, restoring the loss of visibility highlighted by Onisick. In addition, the VCS Logical Chassis management construct provides a much simpler means of scaling the host-network interface. Although VCS fabrics are actually masterless, the logical centralization of management allows the Logical Chassis view to serve as a physical network peer to the SDN controller, while providing the SDN controller a means of scaling across many fabric domains (each domain appears as a single switch), vs a plethora of interactions with each individual node. (NB I'm highlighting some of the specifics of VCS fabrics for the sake of concrete illustration, but broadly speaking, similar principles apply to other fabrics.)
Where many disagree with the Nicira stance is in the claim that an ideal network design would involve hardware that is cheap and simple to operate, and “vendor-neutral”, i.e., easily replaced. I would argue that what matters in terms of network portability is not that hardware needs to be indistinguishable from one vendor to the next; rather, it needs to be able to present vendor neutrality at a policy level. Hardware performance and manageability absolutely continue to matter and remain primary purchasing criteria assuming equivalent support for higher-level policy abstraction.
Or as Brad Hedlund observed over the weekend:
The ONF recently announced the formation of a Chipmakers Advisory Board (CAB) to help ONF leadership “grok” the chipmakers’ world view such that the ONF can more effectively encourage OpenFlow support in new networking silicon. After a call for applications, the ONF selected thirteen individuals from thirteen member firms that design their own chips. The firms include merchant silicon vendors, network processor vendors and system vendors that use internal ASICs for their systems. As an ASIC designer, I am both honored and excited to be able to represent Brocade on the CAB. The honor seems self-evident; the CAB is a remarkable group of people gathered to positively influence the organization at the core of the biggest disruption in networking since the Internet. The excitement comes because the formation of the CAB highlights the ONF’s readiness to aggressively pursue broad adoption of OpenFlow.
Here are some of the leading challenges that I see the ONF and the CAB working to address in the coming weeks.
Aligning the Hardware / Software Camps
I’ve written before about the tension between the software-centric and hardware-centric camps in the OpenFlow community. Not surprisingly, the dominant organization in the Software Defined Networking advocacy group (the ONF) has a heavy presence of software folks. The CAB, on the other hand, is definitely hardware dominant. The ONF, by creating the CAB, is aggressively seeking to bridge the hardware / software understanding gap, which is key to enabling a robust OpenFlow ecosystem.
The customers that I’ve spoken to have a strong interest in products that are OpenFlow-capable, but which also support a rich collection of legacy protocols. Such hybrid devices are essential to providing a realistic transition picture for SDN deployment. Without hybrid, customers would face high stakes binary architecture / purchase decisions at a time when applications are immature. Without a transition story, few buyers can justify the risk, and the market can stall before it really gets started. Without a healthy early market, it’s tricky to develop the apps, standards, interoperability and various other elements that characterize a robust market. A chip that supports hybrid boxes provides vendors with win-win: access to the small-but-growing OpenFlow market while also serving the larger established legacy market. OpenFlow-optimized silicon will be much easier to invest in down the road.
Today’s networking silicon is diverse for a variety of reasons. OpenFlow Switch 1.0 was simple and supportable across a wide variety of chips because it depended only on basic common features present in most platforms. But newer versions of OpenFlow go beyond those most common features. Many newer OpenFlow features are optional. That broadens the range of what OpenFlow can support, but it also makes implementations more diverse and makes it trickier for app architects to know which features they can count on. This “interoperability challenge” was highlighted to some degree at the most recent ONF plugfest at Indiana University. This plugfest included multiple products with preliminary support for OF-Switch 1.3, and the diversity of implementation has been a topic of discussion. (The Testing and Interoperability Working Group (TestWG) is working up a technical paper as they have for previous plugfests.) The Forwarding Abstractions Working Group (FAWG, of which I’m the chair) is working to mitigate challenges related to platform diversity, though some interoperability challenges, such as management, are outside the scope of FAWG.
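To make the "which features can I count on" problem concrete, here is a minimal sketch. It assumes an application has somehow collected the set of optional capabilities each switch supports; the switch names, feature labels and helper functions are hypothetical and not taken from any product or from the OpenFlow specification's wire format.

```python
# Hypothetical inventory: optional capabilities reported by each switch.
# Names and labels are illustrative only.
switch_features = {
    "switch-a": {"group-select", "meter", "mpls-push", "multi-table"},
    "switch-b": {"group-select", "multi-table"},
    "switch-c": {"group-select", "meter", "multi-table"},
}


def common_features(inventory: dict[str, set[str]]) -> set[str]:
    """Features every switch supports, i.e. what an app can rely on everywhere."""
    if not inventory:
        return set()
    return set.intersection(*inventory.values())


def missing_features(inventory: dict[str, set[str]], required: set[str]) -> dict[str, set[str]]:
    """Map each switch to the required features it lacks (compliant switches omitted)."""
    return {
        name: required - feats
        for name, feats in inventory.items()
        if required - feats
    }


print(sorted(common_features(switch_features)))
print(missing_features(switch_features, {"meter"}))
```

The more optional features a specification allows, the smaller this common set tends to become across a mixed deployment, which is one way to picture the diversity problem described above.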
Configuration and Management
The Configuration and Management Working Group (CMWG) defines the OF-Config protocol, including OF-Config 1.0 and 1.1 (OF-Config 1.2 is in process at this time), but the adoption of OF-Config is behind that of OF-Switch. Compounding matters, the leading virtual switch, Open vSwitch, uses OVSDB as its management protocol. ONF has not yet specified a management protocol as part of the OpenFlow Conformance Program, so a conformant product could use OF-Config, OVSDB or some other management mechanism.
The ONF established some aggressive milestone dates for the CAB, and the members are attacking the deliverables with gusto. The CAB’s first opportunity to “synch up” with other ONF leadership comes on August 7, when the CAB, the Council of Chairs (CoC) and the Technical Advisory Group (TAG) will meet for the first such face-to-face conversation. (The CoC and the TAG have already had several face-to-face meetings; they are very helpful.)
Today’s complex chips require 12 to 18 months to develop; revolutionary architectures are more toward the long end of that spectrum. Add the time required to productize, test, train, etc., and we’re looking at 30 months from project start until real revenue begins. With tens of millions of dollars required to get a new chip to market, firms require confidence that the spec is stable and that demand two to four years out will be big. On the other hand, hybrid boxes based on existing or evolutionary chips cost less, are less risky, and provide a compelling transition story. The CAB will be working to explain how its members view all these topics. I’m confident that the resulting alignment will accelerate the delivery of compelling solutions.
Image credit: File:Cabs.jpg - Wikimedia Commons
In my previous blog, I briefly discussed the Network Functions Virtualization (NFV) movement and the reasons why traditional Service Providers (SPs) are adopting NFV. One of the major qualifications for NFV will be achieving virtualized performance and scalability comparable to what is now considered the norm in the hardware paradigm. Among the various trends that will define I/O virtualization, the two most distinct ones for consideration in NFV are:
In the virtualized environment, Layer 2 (L2) performance and throughput match line rate even at 10G, thanks to L2 switching performance improvements in virtual switches. For NFV, and in this blog, the performance discussion refers to Layer 3 (L3) services: firewall, routing, load balancers, etc. The quest is to find out how higher packet throughput can be achieved when a combination of such services is configured on the virtual networking device, i.e. a virtual machine that supports services such as firewall, routing and load balancing.
The performance degradation for L3 services in the virtual environment can be attributed to the components that sit between the Virtual Machine (VM) interface and the server’s hardware Network Interface Card (NIC).
Figure 1: A generic depiction of the layers between the VM interface and physical NIC.
These layers generally consist of the hypervisor’s virtual switches, system level drivers, hypervisor OS drivers, and so on. I/O techniques such as Peripheral Component Interconnect passthrough (PCI passthrough) and Single-Root I/O Virtualization (SR-IOV) are ways to bypass these components and tie the NIC directly to the interfaces of the virtual machine for higher throughput and performance.
Let’s look at SR-IOV as an I/O technique with Intel’s architecture to achieve the target 10G performance for NFV. SR-IOV is defined by a PCI-SIG specification with Intel acting as one of the major contributors to the specification. The main idea is to replicate the resources to provide a separate memory space, interrupts, and DMA (Direct Memory Access) streams per VM, and for each VM to be directly connected to the I/O device so that the main data movement can occur without hypervisor involvement.
In traditional I/O architectures, a single core has to handle all the Ethernet interrupts for all the incoming packets and deliver them to the different virtual machines running on the server. Two core interrupts are required - one to service the interrupt on the NIC (incoming packet) and determine which VM the packet belongs to, and the second on the core assigned to the VM, to copy the packet to the VM where the packet is destined. This results in increased latency as the hypervisor handles every packet destined for the VMs.
Figure 2: SR-IOV yields higher packet throughput and lower latency
To achieve some of the stated benefits in Figure 2, SR-IOV introduces the idea of a Virtual Function (VF). Virtual Functions are ‘lightweight’ PCI function entities that contain the resources necessary for data movement. With Virtual Functions, SR-IOV provides a mechanism by which a single Ethernet port can be configured to appear as multiple separate physical devices, each with its own configuration space. The Virtual Machine Manager (VMM on hypervisor) assigns one or more VFs to a VM by mapping the actual configuration space to the VM’s configuration space.
Figure 3: A physical NIC is carved into multiple VFs that are assigned to guest VMs (Intel).
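For readers who want to see what carving a NIC into VFs can look like in practice, here is a minimal sketch for a Linux host (for example, one running KVM). It assumes a kernel and NIC driver that expose the `sriov_totalvfs` and `sriov_numvfs` sysfs attributes; the helper name `set_num_vfs`, the interface name `eth0` and the `sysfs_root` parameter are illustrative choices of mine, not part of any vendor tooling, and other hypervisors configure VFs differently.

```python
from pathlib import Path


def set_num_vfs(iface, count, sysfs_root="/sys"):
    """Ask the NIC driver to create `count` Virtual Functions on `iface`.

    Assumes a Linux host whose driver exposes sriov_totalvfs/sriov_numvfs.
    `sysfs_root` is a hypothetical parameter that lets the function be
    exercised against a fake directory tree instead of the real /sys.
    """
    dev = Path(sysfs_root) / "class" / "net" / iface / "device"
    total_path = dev / "sriov_totalvfs"   # maximum VFs the device supports
    num_path = dev / "sriov_numvfs"       # VFs currently enabled

    total = int(total_path.read_text().strip())
    if not 0 <= count <= total:
        raise ValueError(f"{iface} supports at most {total} VFs, got {count}")

    current = int(num_path.read_text().strip())
    if current == count:
        return count

    # The kernel rejects a change from one non-zero VF count to another,
    # so disable the existing VFs first.
    if current != 0 and count != 0:
        num_path.write_text("0")
    num_path.write_text(str(count))
    return count
```

On a real host this would be called as `set_num_vfs("eth0", 4)` with root privileges. The VMM then maps each resulting VF into a guest, which is a hypervisor-specific step not shown here.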
When a packet comes in, it is placed into a specific VF pool based on the MAC address or VLAN tag. This allows a direct DMA transfer of packets to and from the VM, bypassing the hypervisor and the software switch in the VMM. The hypervisor is not involved in moving the packet between the hardware interface and the actual VM, which removes that bottleneck from the path.
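The sorting step described above happens in NIC hardware, but the logic can be illustrated in a few lines of Python. This is a toy model under my own assumptions, not any NIC's actual filter implementation: the `VirtualFunction` type and `classify` function are hypothetical names, and the matching rule (destination MAC must match, and the VLAN must also match if the VF has one configured) is one plausible policy among several.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VirtualFunction:
    """Illustrative stand-in for one VF's receive filter."""
    index: int
    mac: str
    vlan: Optional[int] = None  # None: accept the MAC on any VLAN


def classify(vfs: Iterable[VirtualFunction],
             dst_mac: str,
             vlan: Optional[int] = None) -> Optional[int]:
    """Return the index of the VF pool that should receive a frame.

    Returns None when no VF filter matches, in which case a real NIC
    would typically hand the frame to a default pool.
    """
    dst = dst_mac.lower()
    for vf in vfs:
        mac_matches = vf.mac.lower() == dst
        vlan_matches = vf.vlan is None or vf.vlan == vlan
        if mac_matches and vlan_matches:
            return vf.index
    return None


# Two VFs, each assigned to a different guest VM (addresses are made up).
vfs = [
    VirtualFunction(index=0, mac="52:54:00:00:00:01"),
    VirtualFunction(index=1, mac="52:54:00:00:00:02", vlan=100),
]
pool = classify(vfs, "52:54:00:00:00:02", vlan=100)
```

Once the pool is chosen, the NIC can DMA the frame straight into the memory of the VM that owns that VF, which is why the hypervisor never has to touch the packet.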
SR-IOV is supported on most major hypervisors, such as Xen, KVM, Hyper-V 2012, and ESXi 5.1. Newer servers and 10G NICs with SR-IOV support are required (and virtualization is a must for NFV). While SR-IOV is one way to achieve high packet throughput, the actual deployment will give rise to discussions regarding hardware support (hypervisor, NICs), traffic separation, etc. Another issue is that even though the physical NIC can be carved into multiple VFs (the number depends on the NIC and the hypervisor), there are practical limitations on how many VMs can be deployed to share this NIC. Techniques for VM mobility, where a VM moves from a VF on one server to a VF on another server, have yet to be explored. The open question is whether customers will require VM mobility or if we can assume that these VMs are immobile and tied to the SR-IOV server that they are initially deployed on. While SR-IOV promises to deliver high packet throughput, the exact nature of these improvements for L3 services will ultimately depend on a vendor’s software networking architecture or vendor-enforced license throttling.
Like NFV, SR-IOV is a new and exciting paradigm shift. In the months to come, there will be lots of activity to define the solutions and use cases that will lend themselves to the deployment of virtual networking software. As NFV gathers momentum, it will be interesting to learn what the DevOps environment will look like; perhaps that will be the topic of my next blog.