Review: Is white-box switching the future of networking?

What if you could manage your data center switches and routers the same way you manage your servers, and cut capital expense costs in the bargain?

That’s the pitch for white-box networking, the move toward open-source network operating systems running on commodity hardware.

To find out what it’s like to live in a white-box world, we tested Cumulus Linux, from Cumulus Networks, running on the AS5712-54X data center switch from Edge-Core Networks, a spinoff from Accton Technology.

The AS5712 has the same guts as a lot of other 10G/40G Ethernet top-of-rack switches, and Cumulus Linux supports most of the right data center buzzwords: VXLAN tunneling, equal-cost multipath (ECMP) routing, and multi-chassis link aggregation (MLAG). Later this year, Cumulus says it will support multiprotocol label switching (MPLS) and Virtual Routing and Forwarding Lite (VRF Lite).

Although we ran an extensive battery of performance tests, our main interest was to learn how white-box networking differs from the experience with data-center incumbents such as Arista, Brocade, Cisco, Juniper and HP.

While performance is a wash, we found big differences in price, configuration, management, and usability. Linux-based networking systems offer more power and control than proprietary alternatives, but the Linux learning curve can be steep. We don’t think white-box networking is ready for campus networks. But in the data center, especially given the cost savings, we think a growing number of network engineers will find white-box networking worth a look.

Here are 10 lessons we learned while evaluating white-box networking.

1. WHITE BOX WINS ON PRICE

Physically, the AS5712 looks like lots of other data center top-of-rack switches. Built around Broadcom’s Trident2 switching silicon, it supports up to 72 10G Ethernet interfaces, or 48 10G Ethernet and six 40G Ethernet uplinks. We tested it both ways, each using low-cost direct-attach copper cables (DACs) attached to our Spirent TestCenter traffic generator/analyzer.

Excluding DACs or other transceivers, Edge-Core sells the AS5712 through resellers at a suggested price of $8,898, including a one-year subscription to Cumulus Linux. Each additional year of licensing and 24/7 support has a suggested price of $999.

Those numbers compare favorably with, for example, a Cisco Nexus 3172, which starts at $14,000, plus an additional $5,000 for layer-3 routing support. The usual disclaimer applies here: Pricing is squishy, as few customers pay list price. Final cost depends heavily on quantity, features, supplier politics, and other factors. But as a starting point, commodity hardware and Linux software list prices begin at a substantially lower point.
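To make the list-price gap concrete, here is a small sketch that does the arithmetic using only the figures quoted above. It assumes the $999 fee applies to every year after the first, and it leaves out Cisco support contracts, cables, and discounts because this review gives no numbers for them.

```python
# List prices quoted in this review (US dollars); street prices will differ.
WHITE_BOX_BASE = 8_898        # AS5712 with the first year of Cumulus Linux
WHITE_BOX_EXTRA_YEAR = 999    # each additional year of license and support
NEXUS_BASE = 14_000           # Cisco Nexus 3172 starting price
NEXUS_L3_LICENSE = 5_000      # layer-3 routing support


def white_box_cost(years: int) -> int:
    """List cost of the white-box switch over a number of years (at least 1)."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return WHITE_BOX_BASE + (years - 1) * WHITE_BOX_EXTRA_YEAR


def nexus_cost() -> int:
    """List cost of the Nexus 3172 with routing; support contracts excluded."""
    return NEXUS_BASE + NEXUS_L3_LICENSE


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, white_box_cost(years), nexus_cost())
```

On these list prices alone, three years of the white-box switch comes to $10,896 against $19,000 for the Nexus with routing enabled.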

2. IT’S JUST LINUX

While network boxes running on Unix-like operating systems are not new, Cumulus Linux is more tightly coupled with Linux than competing offerings.

Cumulus Linux doesn’t just use Linux as a boot loader; it is Linux. Based on the Debian distribution, it provides all networking functions through standard Linux tools. Its command-line interface (CLI) is a Bash prompt. It uses iproute2 tools for interface configuration, runs the Quagga daemons for routing, and offers automation through unmodified Linux APIs.

In contrast, many other vendors of Linux-based networking devices extensively rewrite routing and switching functions, and make these available through proprietary CLIs or APIs.

Cumulus says it’s tried to make Cumulus Linux “as open as possible.” Almost all components have source code available under the GNU General Public License (GPL). The only exception is switchd, the Cumulus-developed daemon that deals with the Broadcom Trident2 chip, and that’s because Broadcom’s software development kit (SDK) is closed-source. (Parts of Arista’s EOS also are available under open-source licenses.)

3. IT’S PORTABLE

Cumulus Linux also differs in terms of portability. Competing network operating systems, whatever their features (Linux-based ones included), run on one vendor’s hardware and only that hardware.

Cumulus Linux runs on multiple vendors’ white-box switches. As of this writing, Cumulus’ 10G Ethernet hardware compatibility list includes switches from Edge-Core (tested here), Dell, HP, Penguin Computing, Quanta, and Supermicro. All these vendors offer switch hardware built around Broadcom’s Trident2 chip. This is similar to the white-box server market, where open-source Linux or BSD operating systems run on x86-based hardware from many vendors.

Portability works both ways. The Edge-Core switch runs any network operating system that supports the Open Network Install Environment (ONIE). Besides Cumulus Linux, this currently includes software from Big Switch Networks, the ONOS project, Pica8, and Pluribus Networks.

Cumulus Linux can be configured and managed through automation and orchestration tools such as Ansible, Chef, Puppet, and SaltStack. Cumulus has sample Ansible and Puppet scripts on its website. Centralized configuration and control is not mandatory; management of a single switch from the Bash prompt works equally well.

Perhaps the most useful way of thinking about a white-box network device is that it can be managed in exactly the same ways as a server. Network managers can make changes on one box at a time at the Bash prompt; or on a few systems using quick-and-dirty shell scripts; or on thousands of switches using orchestration software.
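The "few systems" case can be as simple as looping over hostnames and running the same command on each over SSH. The following is a minimal sketch of that idea, not a Cumulus-supplied tool. The hostnames, the `cumulus` login name, and the helper names are assumptions made for illustration.

```python
import subprocess


def build_ssh_command(host, remote_command, user="cumulus"):
    """Return the argv list for running one command on one switch via ssh."""
    return ["ssh", f"{user}@{host}", remote_command]


def run_everywhere(hosts, remote_command, runner=subprocess.run):
    """Run the same command on every host and map each host to its exit code."""
    results = {}
    for host in hosts:
        completed = runner(
            build_ssh_command(host, remote_command),
            capture_output=True,
            text=True,
        )
        results[host] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical switch names; replace with real hosts that accept key-based ssh.
    print(run_everywhere(["leaf1", "leaf2"], "uptime"))
```

Beyond a handful of switches, an orchestration tool takes over the inventory, retries, and reporting that a loop like this leaves out.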

4. SWITCHING SYNTAX IS DIFFERENT

Setting up basic layer-2 switching isn’t hard in Cumulus Linux, but the syntax is different from conventional data center switches.

Cumulus Linux stores layer-2 information about interfaces and VLANs in the /etc/network/interfaces file. Users edit this file to make configuration changes.

In Linux terminology, all members of the same VLAN are in an “Ethernet bridge.” Here is an excerpt from the interfaces file that creates a bridge called br0 and assigns the first 48 switch ports to that bridge:

auto br0
iface br0
bridge-ports glob swp1-48
bridge-stp off

The glob keyword is similar to interface range definitions in other vendors’ switches in describing a group of ports. The swpX designation refers to switch ports. All front-panel Ethernet interfaces are swpX interfaces except for eth0, which is reserved for out-of-band management.
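To show what a range such as swp1-48 stands for, this short sketch expands one into individual port names. It is an illustration only, not the parser Cumulus Linux uses, and the function name is hypothetical.

```python
import re


def expand_port_glob(spec):
    """Expand a range like 'swp1-48' into ['swp1', ..., 'swp48'].

    A plain name such as 'swp7' is returned unchanged as a one-item list.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)-(\d+)", spec)
    if match is None:
        return [spec]
    prefix = match.group(1)
    first, last = int(match.group(2)), int(match.group(3))
    if first > last:
        raise ValueError(f"range runs backwards: {spec}")
    return [f"{prefix}{number}" for number in range(first, last + 1)]


if __name__ == "__main__":
    ports = expand_port_glob("swp1-48")
    print(len(ports), ports[0], ports[-1])
```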

In the Edge-Core switch, ports 49 to 54 are 40G Ethernet interfaces. We can add them to the same bridge, or optionally configure each 40G Ethernet port to supply four more 10G Ethernet interfaces via QSFP+ breakout cables. We tested the switch both ways, and had to edit the /etc/cumulus/ports.conf file to use the switch with breakout cables. The same file allows users to group four 10G Ethernet ports to act as a 40G Ethernet port, again via a breakout cable.

Another difference from most other switches is that layer-2 configuration changes aren’t instantaneous. Instead, users first edit the interfaces file, and then issue the ifreload -a command. This is conceptually similar to the commit command in Juniper’s Junos OS.
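The edit-then-reload workflow lends itself to scripting. The sketch below writes a new interfaces file and then calls ifreload -a. It assumes root privileges and that ifreload is on the PATH, and the helper name and the temporary-file approach are illustrative choices rather than anything Cumulus prescribes.

```python
import os
import subprocess
import tempfile


def apply_interfaces(new_text, path="/etc/network/interfaces", runner=subprocess.run):
    """Replace the interfaces file with new_text, then reload the configuration."""
    directory = os.path.dirname(path) or "."
    # Write to a temporary file in the same directory so the swap is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(new_text)
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Nothing takes effect until the reload runs, much like a commit.
    return runner(["ifreload", "-a"], check=True)
```

Keeping the write and the reload in one function mirrors the two-step nature of the change: the file can be reviewed or version-controlled before anything touches the running switch.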

Note that this example disables spanning tree protocol (STP). That’s a common default in modern data center network designs, where other mechanisms handle loop prevention and redundancy. These redundancy mechanisms may include Transparent Interconnection of Lots of Links (TRILL) or proprietary variants such as Cisco FabricPath, as well as ECMP and MLAG.

Perhaps because of these other mechanisms, Cumulus Linux doesn’t fully support spanning tree. While it does do standard and rapid versions of STP, it does not support the multiple spanning tree protocol (MSTP), which maps groups of VLANs to separate spanning tree instances.

Confusingly, Cumulus Linux provides spanning tree through the mstpd daemon, even though there’s no MSTP support. This isn’t a major shortcoming in data center networks, given the move away from STP, but it’s potentially an issue if Cumulus Linux were to move into campus networks.

VLAN configuration is the opposite of the Cisco model, binding interfaces to VLANs instead of the other way around. Here’s an example for a Cisco Nexus switch that creates VLANs 301 and 302 and then configures an interface as a trunk port for those VLANs:

conf t
vlan 301-302
interface e1/1
switchport
switchport mode trunk
switchport trunk allowed vlan 301-302
no shutdown
end

The Linux model does the opposite, mapping interfaces to the bridges (VLANs). This excerpt from Cumulus Linux is functionally identical to the Cisco trunk example:

auto br0
iface br0
bridge-ports swp1
bridge-vlan-aware yes
bridge-vids 301 302

Some HP ProVision/ProCurve switches also work in a way similar to Cumulus Linux, but switches from Cisco, Arista, Juniper, Brocade, and others generally follow the opposite model.
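Because the bridge stanza is plain text, it is easy to generate from data. Here is a minimal sketch that renders a VLAN-aware bridge like the one in the excerpt above from a list of ports and VLAN IDs. The function name and the four-space indentation are illustrative assumptions; the keywords are the ones shown in the excerpt.

```python
def render_trunk_bridge(name, ports, vids):
    """Render an interfaces stanza for a VLAN-aware bridge carrying the given VLANs."""
    for vid in vids:
        if not 1 <= vid <= 4094:
            raise ValueError(f"VLAN ID out of range: {vid}")
    lines = [
        f"auto {name}",
        f"iface {name}",
        f"    bridge-ports {' '.join(ports)}",
        "    bridge-vlan-aware yes",
        f"    bridge-vids {' '.join(str(vid) for vid in vids)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_trunk_bridge("br0", ["swp1"], [301, 302]), end="")
```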

One area of common ground is that Cumulus Linux, like most other switches, configures VLAN access ports on a per-interface basis:

auto swp11
iface swp11
bridge-access 301

5. ROUTING SYNTAX IS MOSTLY THE SAME

Cumulus Linux runs Quagga, the open-source routing stack. Quagga implements all major IP routing protocols, including IPv4 and IPv6 versions of BGP, IS-IS, OSPF, and RIP, and its CLI closely resembles that in Cisco IOS and other IOS-like operating systems.

Users run Quagga by enabling zebra and selected routing protocol daemons and then starting the quagga service. Users then can access the Quagga interface with the sudo vtysh command. (Note the use of sudo to become superuser; Cumulus Linux really is “just Linux” since superuser status is required to start and stop processes.)
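Enabling daemons amounts to flipping flags in a text file. The sketch below assumes the usual Debian packaging of Quagga, where a daemons file (typically /etc/quagga/daemons, a path this review does not state) holds lines such as zebra=no. It rewrites only the requested entries and leaves everything else alone; the function name is hypothetical.

```python
import re


def enable_daemons(daemons_text, names):
    """Return daemons_text with each requested 'name=no' entry set to 'name=yes'."""
    wanted = set(names)
    updated = []
    for line in daemons_text.splitlines():
        match = re.fullmatch(r"(\w+)=(yes|no)", line.strip())
        if match and match.group(1) in wanted:
            line = f"{match.group(1)}=yes"
        updated.append(line)
    return "\n".join(updated) + "\n"


if __name__ == "__main__":
    sample = "zebra=no\nbgpd=no\nospfd=no\n"
    print(enable_daemons(sample, ["zebra", "ospfd"]), end="")
```

After writing the result back with superuser rights, the quagga service still has to be started before vtysh has anything to talk to.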

As noted, Quagga’s command syntax is similar to Cisco’s. For example, these commands will configure OSPFv2 on Cumulus Linux:

sudo vtysh
configure terminal
router ospf
router-id 198.18.0.1
log-adjacency-changes detail
interface swp1
ip ospf area 0.0.0.0

And here are the equivalent commands for a Cisco Nexus device running NX-OS:

configure terminal
router ospf 1
router-id 198.18.0.1
log-adjacency-changes detail
interface ethernet 1/1
no switchport
ip address 192.198.0.1/30
ip router ospf 1 area 0.0.0.0

Similarly, users can see routing status using Cisco-like commands such as show ip route summary and show ip ospf neighbor. Another common trait with Cisco and Cisco-like devices: Unlike Cumulus Linux’s layer-2 commands, configuration changes entered into Quagga take immediate effect.

Where Quagga differs is that users can’t configure IP addresses on interfaces, because it provides only the routing stack. Instead, users define interfaces by editing the /etc/network/interfaces file outside the Quagga shell and rerunning the ifreload -a command. Once back inside the Quagga shell, users can see interface status with the show interface command.

Another difference: Quagga doesn’t directly support command piping, for example to filter verbose output. That capability is available, though. Users can run Quagga commands from the Bash prompt and pipe the output to any tool Linux provides. For example, to see only OSPF routing parameters instead of the entire Quagga configuration, a user could run vtysh -c 'show run' | grep -4 'router ospf' from the Bash prompt.

Output piping is extremely powerful. It makes available Linux tools such as awk, sed, cut, sort, and scripting languages such as Perl, Python, and Ruby (all of which come installed with Cumulus Linux).
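The same filtering can be done in Python when grep's fixed line count is too blunt. This sketch runs vtysh non-interactively and pulls out one configuration section, stopping at the "!" separator line. The helper names are hypothetical, and the sample text in the usage example is made up for illustration, not captured from a switch.

```python
import subprocess


def extract_section(config_text, header):
    """Return the lines from header up to, but not including, the next '!' line."""
    section = []
    inside = False
    for line in config_text.splitlines():
        if not inside:
            if line.strip() == header:
                inside = True
                section.append(line)
        elif line.strip() == "!":
            break
        else:
            section.append(line)
    return section


def running_config(runner=subprocess.run):
    """Fetch the Quagga configuration; needs the rights vtysh itself requires."""
    completed = runner(
        ["vtysh", "-c", "show run"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Illustrative sample only; on a switch, use running_config() instead.
    sample = "hostname sw1\n!\nrouter ospf\n log-adjacency-changes detail\n!\n"
    print("\n".join(extract_section(sample, "router ospf")))
```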

6. LINUX TAKES SOME LEARNING

As with any new product (which for many network professionals translates to “anything that’s not Cisco or something that behaves like Cisco”), there’s a learning curve involved. Cumulus helpfully provides a series of cheat sheets that map common commands in Cisco or Arista products to their Cumulus Linux counterparts.

In some cases, Cumulus Linux configuration syntax is easy. Especially for users experienced with Linux or other Unix-like operating systems, the learning curve won’t be that steep. There’s also the argument that if network engineers can learn one environment, they can learn others. For basic tasks, the knowledge involved in learning Cumulus Linux is not any more obscure than, say, learning how Cisco IOS configuration registers work.

Still, employers might balk given the time and cost of obtaining certifications in the various networking vendors’ environments. Employers that paid for those certifications, and that pay a premium for network engineers with certifications, may be reluctant to retool.

Programming knowledge can help, but it’s not a requirement. In the server admin and DevOps worlds, engineers routinely write scripts of anywhere from a few to a few thousand lines to automate routine tasks. Unlike OpenFlow and some SDN products, where programming is a must, Cumulus Linux deliberately avoids that requirement, offering instead a configuration and management experience that’s more akin to a conventional switch. If users want to interact with Cumulus Linux in a purely programmatic way instead of through the CLI, they can, but it’s not a must.

7. LINUX STORES STUFF IN A LOT OF PLACES

Linux comes from the Unix tradition of using many small tools, each doing one job well. Although that’s changing (for example, with the controversial move by many Linux distributions, including Debian, to adopt the systemd master daemon), life with Linux today means living with configuration information spread across multiple files. Cumulus Linux takes some steps to combat this (see below), but users of current releases still will need to look in multiple places for configuration parameters.
