Today's networks generate a massive amount of data. This data comes not only from traditional network sources such as the counters you might find in an SNMP MIB, but also from a vast and diverse set of other sources. In fact, compute, storage, network, security and energy (CSNSE) infrastructure together generate a much wider variety of data types, ranging from Key Performance Indicators (KPIs) for hardware and software infrastructure components to application data, log files and beyond. Many of these data sets are organized as time series, while others are generated in different ways at different layers of an end-to-end application stack. One primary use case for network service providers is data-driven co-orchestration of components and functions across layers to provide optimized, scalable and resilient services. In fact, so much data is being generated by today's networks that even ingesting it is challenging. As a result, several "big data stacks" have emerged on the analytics scene.
It is worth making the goal here explicit: We know that we can analyze the diverse data sets generated by networks using a variety of algorithms that will give us novel insights into the design and operation of end-to-end service stacks. In particular, by understanding the structure of our data sets, we can classify current (and past) events, understand hidden relationships, and to some extent predict the future. Among other things, we would like to use this new knowledge to enable a deeper level of automation, essentially automating what we currently think of as automation. Clearly these are worthy goals, but how might we accomplish this vision?
Enter Machine Learning
Machine Learning is a sub-field of Computer Science concerned with the study and construction of algorithms that can learn from, and make predictions based on, data gathered from the environment. In particular, we imagine that the data we observe is generated by a set of processes, frequently called a Data Generating Distribution, or DGD. In addition to generating the observed data, the DGD may be governed by "hidden" variables which are not explicitly represented in the mapping from input to behavior. As such, we will also want to be able to reason about these hidden factors. The task of Machine Learning, then, is to learn a good approximation of the DGD that can be used to solve a wide variety of engineering problems involving classification, regression, and generalized prediction.
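To make the DGD idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy "link latency" numbers, the hidden `congested` state, and all function names are hypothetical and do not come from CogNet or any Brocade product. The learning method is also just one possible choice, a textbook two-component Gaussian mixture fitted with Expectation-Maximization (EM), picked because it shows both halves of the story: approximating the distribution from observations alone, and then reasoning about the hidden variable.

```python
import math
import random


def sample_dgd(n, rng):
    """Draw n latency samples (ms) from a toy DGD with a hidden 'congested' state."""
    samples = []
    for _ in range(n):
        congested = rng.random() < 0.3  # hidden variable, never shown to the learner
        mean, std = (50.0, 5.0) if congested else (10.0, 2.0)
        samples.append(rng.gauss(mean, std))
    return samples


def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(data, iterations=50):
    """Approximate the DGD with a two-component 1-D Gaussian mixture via EM."""
    # Crude initialisation: one component at each extreme of the data.
    means = [min(data), max(data)]
    spread = (max(data) - min(data)) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each sample came from component 1.
        resp = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate each component from its soft assignments.
        for k in (0, 1):
            r = [ri if k == 1 else 1.0 - ri for ri in resp]
            nk = sum(r)
            if nk < 1e-9:
                continue  # component received no mass; keep old parameters
            means[k] = sum(ri * x for ri, x in zip(r, data)) / nk
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)
            weights[k] = nk / len(data)
    return weights, means, stds


def prob_congested(x, weights, means, stds):
    """Posterior probability of the high-latency hidden state given one sample."""
    p0 = weights[0] * normal_pdf(x, means[0], stds[0])
    p1 = weights[1] * normal_pdf(x, means[1], stds[1])
    return p1 / (p0 + p1)


# Usage: the learner sees only latencies, never the hidden state.
rng = random.Random(0)
data = sample_dgd(2000, rng)
weights, means, stds = fit_two_component_mixture(data)
print("learned weights:", weights)
print("learned means:", means)
print("P(congested | 48 ms):", prob_congested(48.0, weights, means, stds))
```

The fitted weights, means and standard deviations are the learned approximation of the DGD, and `prob_congested` is a small example of reasoning about a hidden factor from observed data. A real deployment would of course face far messier, higher-dimensional data than this toy.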
Machine Learning Is A Key Technology For Network Service Providers
Network service providers, both wireline and wireless, have long been both producers and consumers of large data sets. So it comes as no surprise that, against the backdrop of rapid advancements in Machine Learning theory and practice and ever-growing data sets (consider the data sets that might be generated by, say, a smart city of just a few million people), most service providers have kicked off significant Machine Learning activities. One of these activities, CogNet, aims to use data-driven approaches (i.e., Machine Learning) to build intelligent systems for 5G network management and security. CogNet is one of the 19 projects recently launched by the European Commission and the 5G Association to implement Phase 1 of the European 5G Public Private Partnership. Not surprisingly, the CogNet consortium is particularly interested in using Machine Learning for Network Management, Network Security and Integrity, and Virtual Network Platform and Software Networks orchestration and optimization.
I was fortunate enough to attend the CogNet meeting last month after IETF 96 at Orange Gardens, the new state-of-the-art Orange facility located in Chatillon, just outside of Paris (the talk I gave can be found here). What I found there was a group of mobile carriers and vendors who shared my vision of Machine Learning as a foundational technology for many of the components of their businesses going forward. Examples ranged from quasi-real-time, data-driven optimization of spectrum, to "smart VNFs" which use data from their environment to optimize their performance and security, to network- and service-aware orchestration that can self-organize and self-optimize in a data-driven manner to react to ever-changing network conditions.
Brocade's View of the Intelligent Network Future
Perhaps the most interesting part of my visit with the CogNet crew was the commonality of goals that Brocade shares with the CogNet community. In particular, we agree that:
In summary, we at Brocade see Machine Learning as a key technology component of our products moving forward. As such, we are aggressively integrating intelligent, Machine Learning-enabled components into the infrastructure and orchestration of the end-to-end stacks that we are building.
We are living through the beginning of an exciting intelligence revolution in the network industry. Watch this space for more current events from this exciting frontier.
It is worth noting that the Machine Learning community has a long tradition of open source, open data and indeed open science.