Open source software takes the storage stage

Open-source storage software is available for a number of user needs

Tens of thousands of users are deploying open-source storage software in an effort to avoid pricey proprietary products such as array clustering and disk eraser applications and to get some long-term protection through the availability of source code.

Rafiu Fakunle, CEO of London-based open-source vendor Xinit Systems Ltd., said users have downloaded more than 38,000 copies of its Openfiler NAS and SAN software from its Web site. And Zmanda Inc. in Sunnyvale, Calif. -- the company providing support for the open-source backup software product Amanda -- said that it supports 20,000 users worldwide.

Open-source storage software is available to address a number of user needs, experts say. Amanda is a backup software product targeted at small and midsize businesses that allows the creation of a single master backup server to back up multiple hosts. DBAN (Darik's Boot and Nuke) allows users to securely wipe the hard drives of their computers.

Other open-source storage software includes Lustre, OpenAFS and SAMBA, network file systems used for different tasks. Lustre is used in large-scale cluster computing, while OpenAFS is deployed to create a single file space across all computers so that any computer can access a file on any other computer. SAMBA allows Linux servers to provide file and print services to Microsoft Windows clients.

Integrators like the Network Resource Group (NRG) in Manhattan, Kan., say they can deliver substantial savings for their customers using open-source storage software. Terry Hull, a principal network engineer at NRG, recently put together a VLAN for a client using iSCSI and open-source storage software.

"The incremental costs for the technology were $1,500 for the open-source software versus $25,000 for a comparable configuration from Lefthand Networks and $75,000 for a Dell Fibre Channel SAN," Hull said.
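To put Hull's figures side by side, the short sketch below works out the upfront savings. The dollar amounts are the ones he quoted; the labels and the helper name savings_vs are illustrative, and the calculation covers incremental purchase cost only, not the ongoing cost of ownership.

```python
# Incremental cost figures quoted by Hull, in U.S. dollars.
costs = {
    "open-source iSCSI": 1_500,
    "Lefthand Networks": 25_000,
    "Dell Fibre Channel SAN": 75_000,
}

def savings_vs(baseline: str, alternative: str = "open-source iSCSI") -> tuple[int, float]:
    """Return (dollars saved, percent saved) when choosing `alternative` over `baseline`."""
    saved = costs[baseline] - costs[alternative]
    return saved, 100.0 * saved / costs[baseline]

for name in ("Lefthand Networks", "Dell Fibre Channel SAN"):
    saved, pct = savings_vs(name)
    print(f"vs {name}: ${saved:,} saved ({pct:.0f}%)")
# vs Lefthand Networks: $23,500 saved (94%)
# vs Dell Fibre Channel SAN: $73,500 saved (98%)
```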

But experts remain skeptical about the wisdom of implementing open-source storage software products. Jacob Farmer, chief technology officer at Cambridge Computer Inc. in Waltham, Mass., has some clients who implemented OpenAFS and Lustre in order to avoid the high cost of clustered file system software from a company like TerraScale Technologies Inc. in Montreal.

Despite Cambridge Computer's successes in deploying open-source storage software, Farmer said, "only those with highly skilled personnel were able to pull it off. The rest found that these products were too complex and had deceptively high costs of ownership."

Key questions that users need to answer before using open-source storage software are:

  •  What is open-source storage software's value proposition?

  •  What products are available for their specific needs?

  •  How stable and scalable are the products?

  •  What risks do they present?

  •  Under what circumstances should an end user consider open-source?

  •  What level of user skill is required to implement and support them?

  •  What software support options are available?

Value Proposition

The three primary value propositions for open-source storage software are:

  1. Minimal or no upfront software costs.

  2. Baseline features comparable to those of proprietary storage software products.

  3. Availability of source code provides some level of long-term protection.

Open-source storage software can be obtained in one of two ways: freely downloaded from a Web site, or purchased. For example, users interested in trying the open-source Amanda backup software may either download a community edition from the Web site or purchase an enterprise edition from Zmanda's Web site. While the underlying source code should be the same in both instances, Zmanda provides a "sanity check" of the enterprise edition code, ensuring that the version the user downloads and installs has been fully tested and compiled at its labs.

DBAN is an open-source storage software product available in both free and commercial versions. DBAN meets the 5220.22-M standards of the Department of Defense for data erasure by overwriting all disk locations three times. Network Resource Group's Hull primarily supports Linux in his clients' environments and said he uses DBAN on a "constant basis to clean hard drives or partitions for my clients."
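The idea of a three-pass overwrite can be illustrated with a minimal sketch. This is not DBAN's code: DBAN wipes whole drives from its own boot environment, while the hypothetical function below works on an ordinary file. The choice of patterns (all zeros, all ones, then random data) is an assumption based on how the three passes are commonly described. On journaled file systems or flash storage, overwriting a file in place does not guarantee that the old data is gone.

```python
import os

def overwrite_three_passes(path: str, chunk_size: int = 1 << 20) -> None:
    """Overwrite an ordinary file in place three times: zeros, ones, random."""
    size = os.path.getsize(path)
    patterns = [b"\x00", b"\xff", None]  # None stands for a pass of random data
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n) if pattern is None else pattern * n)
                remaining -= n
            # Push each pass to the device before starting the next one.
            f.flush()
            os.fsync(f.fileno())
```

Each pass is a single sequential sweep over the whole file, so the total time grows with both the amount of storage and the number of passes.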

Other users, like David Ritchie, an IT manager at an Atlanta-based staffing firm, still find that DBAN is not quite ready for their environments. With DBAN, which is often used on smaller servers with internal disk drives, Ritchie encountered some quirks when trying to erase data on volumes on external storage. "The amount of storage it displays is different than what is presented by the external storage array and the program runs single-threaded so you need to be strategic in how you deploy it," he said.

The AoE (ATA over Ethernet) protocol provides a method comparable to Fibre Channel for users to connect to external storage using the common 1Gbit/sec. Ethernet protocol and network switches. As a registered IEEE protocol, AoE runs at a lower level in the Ethernet stack than TCP/IP, so it does not impact server performance in the same way that the iSCSI protocol does, yet it provides approximately the same level of performance as more expensive Fibre Channel SANs. Coraid's CEO, Jim Kemp, said, "On a 1Gbit Ethernet link, AoE can achieve 110MB of throughput without burdening the host processor."
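Kemp's figure can be compared with the raw capacity of the link with some quick arithmetic. The sketch assumes that "110MB" means megabytes per second and that a megabyte is one million bytes; neither is spelled out in the quote.

```python
LINK_BITS_PER_SEC = 1_000_000_000   # 1Gbit/sec. Ethernet
QUOTED_MB_PER_SEC = 110             # Kemp's figure, assumed to be MB per second

# Raw line rate in MB/sec, before any framing or protocol overhead.
raw_mb_per_sec = LINK_BITS_PER_SEC / 8 / 1_000_000
utilization = QUOTED_MB_PER_SEC / raw_mb_per_sec

print(f"raw line rate: {raw_mb_per_sec:.0f} MB/sec")            # raw line rate: 125 MB/sec
print(f"quoted AoE throughput uses {utilization:.0%} of it")    # quoted AoE throughput uses 88% of it
```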

However, AoE does have a number of downsides. First, while drivers are freely available for Linux, FreeBSD and Solaris, Windows users still need to purchase an AoE driver such as Rocket Division Software's Starport software. Second, AoE is not a routable protocol, so it cannot be used to access storage on other segments of the LAN. Third, storage products that support this protocol are available from only a few vendors, such as Coraid. Finally, AoE requires newer network switches that provide flow control to maximize throughput and limit network collisions.
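The reason AoE cannot be routed is visible in the layout of its frames: the AoE header follows the Ethernet header directly, with no IP header for a router to act on. The sketch below packs such a frame. The field layout (version and flags, error, shelf, slot, command, tag) and the EtherType 0x88A2 follow the published AoE specification as understood here and should be checked against it; the function name and the sample values are illustrative only.

```python
import struct

AOE_ETHERTYPE = 0x88A2  # EtherType registered for ATA over Ethernet

def build_aoe_frame(dst_mac: bytes, src_mac: bytes, shelf: int, slot: int,
                    command: int, tag: int, payload: bytes = b"") -> bytes:
    """Pack an Ethernet header followed by the 10-byte common AoE header."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    # Ethernet header: destination, source, EtherType. No IP header follows.
    ethernet = dst_mac + src_mac + struct.pack("!H", AOE_ETHERTYPE)
    ver_flags = 1 << 4  # protocol version 1 in the high nibble, no flags set
    # AoE header: ver/flags, error, shelf (major), slot (minor), command, tag.
    aoe = struct.pack("!BBHBBI", ver_flags, 0, shelf, slot, command, tag)
    return ethernet + aoe + payload

# Example: a broadcast configuration query (command 1 is assumed to mean
# "query config information"; 0xFFFF and 0xFF address all shelves and slots).
frame = build_aoe_frame(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01",
                        shelf=0xFFFF, slot=0xFF, command=1, tag=42)
print(len(frame), frame[12:14].hex())
```

Because addressing stops at MAC addresses plus a shelf and slot number, the frame can travel only as far as the local Ethernet segment reaches.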

The availability and accessibility of the source code is also a major advantage of open-source storage software, especially for organizations that archive data for long periods of time. Charles Wegryzn is a developer at Retriever Technologies in Santa Fe, N.M., which is working on open-source content management and digital archiving software. Wegryzn said it used to be fairly typical for users to buy software from IBM, which would include the source code inside. "Then Microsoft came along and changed everything. With open-source, we are going back to our roots of how computer software sales used to work."

Cambridge Computer's Farmer thinks archived data is the single largest value proposition for open-source storage software. Long-term support for proprietary data formats, and the possibility that the vendors providing those formats will go out of business, are valid user concerns now. Farmer said, "With open-source, at least you know you will have support in 25 years, since you own the code."

Open-Source's Hidden Costs

Despite the benefits that open-source storage software offers, users need to establish what the hidden costs are, experts say. The major factors that affect the total cost of ownership are:

  • Product installation and configuration documentation.

  • Product support.

  • Breadth of product functionality.

  • Hardware and software interoperability.

One hidden upfront cost with open-source storage software is finding documentation and scripts that ease its installation and configuration. Coraid's Kemp said, "The open-source community is rich in information, but it is a scavenger hunt to find exactly what you need."

Because of these concerns, the commercial versions of Amanda, DBAN and Openfiler, available from Zmanda, Techway Services Inc. in Grapevine, Texas, and U.K.-based Xinit Systems, respectively, include documentation and install scripts. Protocols like AoE are included with the Linux 2.6.11 kernel or bundled with hardware like Coraid's EtherDrive SR1520.

The costs for supporting open-source storage software show up in different ways. Open-source vendors are in general agreement that managing open-source code and changes to it requires, as a rule of thumb, administrators with at least two to four years of experience.

"Users who like the idea of modifying open source code need to take a close look at the code to make certain that they can work with it and that it is within their skill set to modify," Cambridge Computer's Farmer said.

Integrators like Hull of NRG also encounter other issues with product support. 

"Getting to the root of a problem when you have open-source layer upon open-source layer is rarely easy and the thing we [NRG] know we are giving up with open-source storage software is a significant margin of management," Hull said.

Another major concern for open-source storage software is the breadth of product functionality. Open-source products like Amanda and OpenSMS, a policy-driven systems management storage software product, almost always have certain product restrictions. For example, Amanda will not back up Microsoft Windows hosts unless SAMBA, a file and print sharing utility, is first installed on the Windows host, and Amanda offers no media server option, so all backups must go through a central server. OpenSMS officially supports only Linux 2.4 and 2.6 running on an XFS file system, though it suggests it should work on other Unix platforms and, with some porting, on JFS file systems. OpenSMS offers no integration with Microsoft Windows platforms.

The final major concern for enterprise shops is the lack of verifiable interoperability testing between the open-source storage software and other hardware and software products in the user's environment. NRG's Hull notes that while interoperability is not a major concern for over 90% of his installs, he still never discounts the possibility of having to troubleshoot interoperability issues. Cambridge Computer's Farmer said, "Unless the software has comprehensive support services behind it such as Amanda does, one needs a really good reason to mess around with it since primary storage is such a vital piece of the IT infrastructure."

Next Steps with Open-Source

For the most part, open-source storage software is still largely a work in progress that requires users to have years of practical experience as well as the time to research and support the products. However, there are initiatives under way to make it easier and more practical for users to deploy open-source storage software.

First, vendors like Zmanda are helping to make a product like Amanda a more viable option for the average user. One of Zmanda's goals for the next 12 months is to simplify the install, configuration and management of Amanda so that it can be set up and managed by a novice or entry-level administrator.

Second, open-source projects like Aperi are creating a standards-based, open-source software framework to manage storage networks. Standards like the Storage Management Initiative Specification (SMI-S) define a method for interoperable management in heterogeneous SANs. SMI-S is now included with most storage software, but it provides users with no software to manage storage devices. Aperi goes the final mile and provides users with the open-source storage software needed to manage storage that supports SMI-S.

In the meantime, Farmer offers this advice:

  1. Find open-source software that has a large open-source community behind it, and make sure that you have the skills and time to modify and manage the code.

  2. Go with a low-cost solution with an easy way to migrate your data out if need be.

  3. Don't be afraid to pay an enormous premium for a big-name vendor to avoid the risk.


Copyright © 2006 IDG Communications, Inc.
