I recently worked on a project that brought up some interesting questions. The customer was building a Hadoop cluster and wanted to test performance differences between traditional Ethernet protocols and OpenFlow. While this sounds pretty normal, the conditions surrounding the project made it a bit more complex. I worry that this post may expose how much I have yet to learn about OpenFlow, as well as Hadoop, but I’ve never been shy about putting myself out there. How can you learn anything if you aren’t brave enough to admit you don’t know everything. So, in the comments section, I encourage you to write what kinds of similar experiences you’ve had and if there were any resolutions or solutions that you came to.
The set-up starts off pretty straight forward: 40-node cluster, standard replication in triplicate, and top of rack switches. The details of the servers, drives, processors, or even performance ratios were not provided. I was basically given a picture of what looked like a cabinet of servers that had lines coming out both sides of them to two different top of rack switches. Those switches were uplinked to a legacy core switch on one side and the other side ended in an arrow pointing somewhere off the design. The question was, “Can we do this?” First, I have to explain that as a subject matter expert in Big Data, you get blind-sided by the strangest ideas at least a few times a week. The only benefit my background gives me is a place to start researching. Sadly, many think that the title gives me super-human ability to fire authoritatively from the cuff on ideas that have genuinely never been tried before, with no empirical evidence, with only a minute or two of lead time. I wish it did. I’m not complaining, after being an SE for 8 years, I enjoy the pace, diversity, and pressure. So, back to our story…
The two sides that were later explained to me were separate networks overlaid on top of each other. The left side of the server stack was NIC #1 (every server was dual-NIC’d, 1GbE, no IP binding) and it was connected to a normal ToR 1GbE switch that had a gateway in a “Stable” network where they would ingest the dataset, allow customer access for queries, and output the results. That was a production environment where downtime was frowned upon (Ok, the customer is a bank). The right line coming out of each server went up to an “SDN Switch” and on out through an “SDN gateway” and whether or not a data-set same in from that side, if there was user access from that side, or if outputs were to go out that network was unmentioned. However, this was the experimental side where they would be “courting risk”. Yes, the same servers are expected to run production for a bank and belong to an experimental SDN network separated only by the PCI express bus on the board. This is where solution designers either break down and start crying or get really creative. I must also point out that the local SE had read a LOT about Hadoop and had prepared a few good looking designs but since this was the first dual-interconnect Hadoop cluster that runs half on SDN, I didn't have anything to compare them to.
The idea of this cluster is to see if the OpenFlow mechanisms could out-perform traditional L2 networking in a loosely coupled cluster. This, in my own opinion, is a very interesting study. Think of the ramifications if this were the case. Companies could dual-NIC or use some kind of dual-mode configuration on all of their desktop machines and when everyone went home at night, SDN could build a cluster with all their idle resources and crunch all kinds of jobs in the evenings. It’s like a dual-purpose IT infrastructure. You could also dynamically link machines together in the wireless or cellular carrier space for crowd sourced cluster computing. Sure, these ideas exist today, like the SETI project, but not on this level, where the device is actually a cluster node for a larger running job. To a solutioneer and closet futurist, an epiphany like this goes instantly to the largest, most grandiose use cases imaginable. I’ll spare you the pie-in-the-sky ideas and get back to how this project ended up and what’s happening going forward.
So, I presented the file system problem that Hadoop cuts up files and stores them on member nodes with an IP address in the header, so it’s network specific. To my knowledge, and I get most of that from the O’Reilly book on Hadoop, is that for best reliability, Hadoop should run in one broadcast domain. Before you jump all over me and point out all the different configurations that you can adjust to have multi-domain clusters, I’ll just say that I know they exist and that “Best Practice” was to have the namenode and the datanodes in the same broadcast domain. This lead to the question, “Well, what would you be testing then?” If the default vLAN had all the addressing for the cluster and you used a protected vLAN configuration to put OpenFlow services in a separate vLAN (even if the interfaces were tagged), wouldn’t all the node-to-node traffic always take the default vLAN since that is where the file system exists? This would, in effect, drain any traffic from the 2x vLANs that you are trying to compare and you wouldn’t have any real data to make your observations of which network is faster.
This lead to a few more design ideas that included adding another distribution layer, IP bonding to the same ToR switch, and a few different multi-mode options before the customer decided to simply build two Hadoop clusters.
While the conclusion sounds anti-climactic, there is a reality here that has to do with pushing the boundaries of technology. Some of the lessons that ring in my ears are: If you’re going to have an experimental network, don’t couple it to something that your revenues depend on. Secondly, and probably more importantly, is that we’re all still creating. We all have to understand that Mr. Cutting wrote Hadoop in 2005 and didn’t have a production installation until 2008. It’s only 2013 and while we believe we can clobber mountains in a single keystroke, we have to keep perspective. All of the big names that define the current software revolution are very new and there’s a danger with trying to implement technology for the sake of efficiency without any concrete, historical evidence. When you do this, you are an innovator and on the bleeding edge. If that isn't where you wanted to be, maybe you should take a long look at what your strategy actually is. There’s nothing wrong with blazing a new path towards greater productivity but you have to know that it’s fundamentally different than implementing a tried and true technology into an innovative business process. The innovator brings different kinds of risk to the equation. If your goal is higher productivity through better use of technology and resources, you must allow that technology to find best practices in the world before you make plans that might not yet be possible.
I’m tracking this one closely as I’m keenly interested in the potential benefit of SDN controlled inter-connects for Hadoop. Maybe someone will come up with an application that injects paths based on traffic flow, size, current resource utilization, power consumption, and a host of other variables that propels cluster computing to yet another level of performance. What I do know is that it should be tested separately!
I look forward to your stories of similar environments where you may have been able to pull it off. Even if you weren't I think we could all benefit from hearing about what you went through. In the future, we will be hosting a Big Data social media page with its own Blog as well as public and private groups for sharing lessons learned and best practices throughout industry.