A waste of space: Bulk of drive capacity still underutilized

Most companies can reclaim as much as 60% of their storage capacity with monitoring and thin provisioning tools

Near the turn of the century, data centers were only beginning to implement Fibre Channel storage-area networks (SANs), with most relying on direct-attached storage (DAS). Storage utilization rates were abysmal, with data centers on average using just 25% to 30% of their hard disk drive capacity.

Despite Fibre Channel and IP SAN adoption and the advent of technologies such as thin provisioning, storage resource management, capacity reclamation and storage virtualization, storage utilization rates remain at 40% or lower. In other words, IT shops aren't using as much as 60% of their storage capacity, wasting electricity and floor space.

"Most people I talk to don't even know how many terabytes of capacity they have on the floor, much less what their utilization is. And a lot of them don't even know how they'd measure it if they could," said Andrew Reichman, an analyst at Forrester Research.

Without better use of storage management tools, users won't be able to track utilization and improve it, Reichman said.

Reichman is one of several industry analysts who believe storage utilization today still averages between 20% and 40%, mainly because thin provisioning and storage management applications with advanced reporting capabilities that could point to wasted storage assets aren't being used.

"It's a bit of a paradox," he said. "Users don't seem to be willing to spend the money to see what they have."

For an enterprise-class data center, comprehensive monitoring and reporting software can cost as little as $250,000 or as much as $1 million, and in many cases a full-time employee is needed to manage it, Reichman said.

Rick Clark, CEO of Aptare Inc., said most companies can reclaim large chunks of data center storage capacity because it was never used by applications in the first place. Aptare's main offering, StorageConsole, is used for backup and storage capacity reporting. It can show admins where and how storage is being utilized.

Clark said the main problem in data centers today is that there is no single way to view host servers, networks and storage to determine how efficiently assets are being used.

Aptare's latest version of reporting software, StorageConsole 8, costs about $30,000 to $40,000 for small companies, $75,000 to $80,000 for midsize firms, and just over $250,000 for large enterprises.

"Our customers can see a return on the price of the software typically in about six months through better utilization rates and preventing the unnecessary purchase of storage," Clark said.

In many cases, companies buy more raw disk capacity than they need because the base cost of hard disk storage is pennies per gigabyte. But Reichman and others say that it's a fallacy to think that disk storage is cheap, because it costs money to manage and it eats up data center floor space and electricity.

"I start out every presentation with a slide showing it's a higher percentage of IT spending than any other area. Yes, the dollar-per-gigabyte [cost] has gotten cheaper, but application data growth rates are tremendous," Reichman said. "Saving 100TB of capacity out of a storage environment represents $1 million."
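Reichman's figure can be sanity-checked with simple arithmetic. The sketch below (my own back-of-the-envelope check, not a calculation from the article) shows that valuing 100TB of savings at $1 million implies a fully loaded cost of roughly $10 per gigabyte, far above the raw pennies-per-gigabyte media price mentioned later:

```python
# Back-of-the-envelope check of the quoted figure: if saving 100 TB is worth
# about $1 million, the implied fully loaded cost per GB (disk plus power,
# floor space and management overhead) is about $10.

saved_tb = 100
saved_gb = saved_tb * 1000        # decimal TB, as storage vendors count it
value_usd = 1_000_000

cost_per_gb = value_usd / saved_gb
print(cost_per_gb)                # 10.0 dollars per GB, fully loaded
```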

The problem of capacity underutilization is as old as digital data storage itself. Traditionally, business units have asked storage admins for more capacity than an application would need, to ensure that they wouldn't run out.

In turn, those sysadmins added onto the tab by overprovisioning storage capacity to ensure that they wouldn't be the cause of an application outage. The result has been an enormous waste of disk capacity.

Adding to the problem of underutilization is the misconfiguration of storage capacity, where storage is purchased but never allocated to any server. It just sits idle.

A magic bullet?

Over the past five years or so, thin provisioning, or provisioning only as much storage as an application server actually needs, has been among the most popular technologies for increasing storage utilization. Thin provisioning is a form of virtualization in which application servers do not know the physical location of their storage capacity; they draw from a shared pool of disk capacity behind a layer of abstraction.
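The pooling model described above can be sketched in a few lines. This is an illustrative toy, not any vendor's implementation; the extent size and class names are my own assumptions:

```python
# Illustrative sketch of thin provisioning: a pool hands out physical
# extents only when a virtual volume is actually written to, so a volume
# can appear much larger to the host than the disk it consumes.

EXTENT_MB = 256  # hypothetical extent size

class ThinPool:
    def __init__(self, physical_extents):
        self.free = physical_extents       # extents actually on disk
        self.volumes = {}                  # name -> mapping of virtual extents

    def create_volume(self, name, virtual_size_mb):
        # The host sees the full virtual size; nothing is consumed yet.
        self.volumes[name] = {"extents": virtual_size_mb // EXTENT_MB, "map": {}}

    def write(self, name, virtual_extent):
        vol = self.volumes[name]
        if virtual_extent not in vol["map"]:   # first write: allocate lazily
            if self.free == 0:
                raise RuntimeError("pool exhausted; admin must add capacity")
            self.free -= 1
            vol["map"][virtual_extent] = object()  # stand-in for a real extent

    def used_extents(self):
        return sum(len(v["map"]) for v in self.volumes.values())

pool = ThinPool(physical_extents=100)            # 100 * 256 MB = 25 GB real disk
pool.create_volume("db01", virtual_size_mb=51200)  # host believes it has 50 GB
pool.write("db01", 0)
pool.write("db01", 1)
print(pool.used_extents())   # only 2 extents (512 MB) physically consumed
```

Rewriting an already-mapped extent consumes nothing further, which is why utilization stays close to what applications actually store.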

Thin provisioning applications either automate the provisioning of storage or send alerts to sysadmins to allocate more when a threshold is reached. But it's a relatively small percentage of users in the marketplace who are actually taking advantage of thin provisioning, Reichman said.

Recent research from New York-based TheInfoPro shows that thin provisioning adoption is growing quickly. Anders Loftgren, chief research officer at the firm, said that as many as 50% of the Fortune 1,000 companies it surveyed in June now use thin provisioning or plan to do so. The results of the company's survey of 250 Fortune 1,000 and midsize companies are expected to be released in mid-August, Loftgren said.

Loftgren said the results show utilization rates from 40% to 60% at the companies surveyed. Those numbers shouldn't be considered poor, because most companies pad their capacity needs for future growth with up to 30% more capacity than what is currently required, he said.

"These guys have the charter to make sure their businesses are up and running, and no one wants to get caught not having enough storage capacity," Loftgren said. "Of course, you want to get that to be as efficient as possible."

The most popular thin provisioning vendors include 3Par, Compellent and LeftHand Networks (now part of Hewlett-Packard). All major storage vendors today, however, offer some flavor of thin provisioning, which can increase utilization rates to as much as 80% when optimally used.

For example, a business unit might ask for 50GB of capacity for a new database implementation. But by using thin provisioning technology, a storage administrator can allocate just 5GB and then set thresholds that will either alert him to increase capacity or automatically add it to the application as needed.
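A monitoring pass of the kind described above can be expressed as a simple watermark check. The function below is a hypothetical sketch (the 80% watermark and 5GB growth increment are my assumed defaults, not figures from the article):

```python
# Hypothetical threshold check of the kind thin-provisioning software runs:
# when usage crosses a watermark, either grow the allocation automatically
# or alert the administrator to do so.

def check_threshold(used_gb, allocated_gb, watermark=0.8, auto_grow_gb=5):
    """Return (new_allocation, alert_message) for one monitoring pass."""
    if used_gb / allocated_gb >= watermark:
        new_allocation = allocated_gb + auto_grow_gb   # grow in fixed increments
        return new_allocation, f"grew allocation to {new_allocation} GB"
    return allocated_gb, None

# The article's example: 50 GB requested, but only 5 GB allocated up front.
alloc, alert = check_threshold(used_gb=4.2, allocated_gb=5)
print(alloc, alert)   # 84% used crosses the 80% watermark, so grow to 10 GB
```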

"EMC and Hitachi and IBM have some versions of thin provisioning, but I've talked to zero users that actually are doing what we think of as thin provisioning with oversubscription. My objective evidence tells me that there are virtually no users of it," Forrester's Reichman said.

"Most storage administrators are using storage virtualization for its wide-striping capability, which increases performance and eases storage provisioning, but not for thin provisioning," he noted.

Adam Couture, an analyst at market research firm Gartner Inc., said another reason users may not be embracing thin provisioning and other storage optimization technologies faster is that most aren't replacing infrastructure because of the recession.

"The overwhelming dictate was to control costs -- no capital spending -- which meant you lived with what you had," Couture said. "And if your array wasn't built to take advantage of thin provisioning, there's no way you can retrofit it."

Slumberland Furniture, a Little Canada, Minn.-based company with 2,300 employees and two data centers, was an early adopter of thin provisioning technology. Seth Mitchell, an infrastructure team manager at Slumberland, said the company began using storage arrays from Compellent with thin provisioning in 2004. The company's main SAN has 62.4TB raw capacity, and the other has 8TB.

Because of online retail sales and financial records requirements, the company has intense I/O needs, so 62.5% of the primary storage resides on high-performance Fibre Channel disk, with the remaining capacity on Serial ATA (SATA) disk.

Currently, the company has a disk capacity utilization rate of 66%. Mitchell estimated that if he were not using thin provisioning, the capacity utilization rate would hover around 30%.

Slumberland didn't choose Compellent solely because of its array's default thin provisioning capability. The company needed it more for its modular design, which allows it to grow in capacity and performance over time.

"We have an environment like many retailers, where you really need to prove everything with numbers one step at a time and start small. With a lot of vendors, you had to start with a larger system, or other vendors had smaller systems that required you to rip and replace when you wanted a larger model," Mitchell said.

Over time, however, Slumberland began using Compellent's automated information life-cycle management feature, which uses policies to place less frequently used or less I/O-intensive data on lower-cost SATA disk or tape.

Compellent's technology works by alerting a systems administrator that a capacity threshold is being reached by an application server. The admin can then allocate more capacity with the click of a button.

By comparison, Texas Christian University in Fort Worth, Texas, rolled out its first SAN with thin provisioning from 3Par only two years ago. Prior to that, the school relied on DAS.

Bryan Lucas, executive director of technology resources at the university, said the school has also been implementing a server virtualization strategy that has helped with storage provisioning and eased management requirements.

To date, Texas Christian has virtualized 30% of its 350 servers, with a goal of eventually virtualizing 80%. The school is adding about five servers per month, and virtualization will help cull the number of physical servers needed.

Prior to implementing server virtualization and SANs, the school literally had rack upon rack of DL380 and DL460 servers, with spare disk capacity on all of them. "We just don't have that today. We have racks of blades and pools of SAN drives," Lucas said.

Like other IT managers, Lucas found DAS easy to deploy and manage -- until the school's server farm began to grow. Then managing the disparate DAS islands became unwieldy and inefficient.

"As power became more of a concern, as was physical space, the direct-attached model didn't scale well," he said. "The problem we ran into is having spare capacity on a box. We're buying that server and disk drives thinking it needs to last three or four years, and two or three years later we're buying more. Thin provisioning solves that problem by default. We've been really pleased with the result."

Lucas said his staff doesn't include a SAN specialist, but managing the network has been relatively easy. Since deploying 3Par SANs, he has reclaimed between 40% and 50% of his disk capacity and reduced power consumption by 10% to 20%.

"You don't have all these extra drives spinning that you're not making use of," he said.

Lucas installed the first 3Par InServ S400 array in early 2008, putting the school's generic file server traffic on it, quickly followed by the school's student e-mail system.

He said the school now has three SANs. Thin provisioning is being used on two. The third SAN is mainly in support of video streaming, which would not benefit greatly from thin provisioning.

Lucas said he was cautious about deploying a SAN at first, believing he would need additional staff to manage it. But his vendor has provided proactive support, and the system has run smoothly without needing any regular tweaks.

"I'm not having to dedicate [an employee full time] to manage the storage infrastructure. It's just much easier than I expected," he said. "We've been through two years of this, and I'm still not looking around saying, 'I need to hire additional staff to manage this.' "

Reclaiming storage

Another way leading storage vendors have attempted to address poor utilization rates is storage reclamation. EMC, Hitachi Data Systems, IBM and Symantec have added storage reclamation algorithms to their management software that identify unused data blocks, mark them and place them back into a pool of available storage.
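The identify-mark-return cycle can be sketched conceptually. This is not any specific vendor's algorithm, just a minimal illustration of the idea under the assumption that the software can tell which allocated blocks are still referenced:

```python
# Conceptual sketch of block reclamation: scan the allocation map, find
# blocks that are allocated but no longer referenced by any data, and
# return them to the free pool.

def reclaim(allocated, referenced, free_pool):
    """Move allocated-but-unreferenced blocks back to the free pool."""
    unused = []
    for block in list(allocated):
        if block not in referenced:     # allocated, but nothing uses it
            allocated.remove(block)
            free_pool.add(block)
            unused.append(block)
    return unused

allocated_blocks = {0, 1, 2, 3, 4}
in_use = {0, 2}                 # blocks actually referenced by applications
free = set()

reclaimed = reclaim(allocated_blocks, in_use, free)
print(sorted(reclaimed))        # [1, 3, 4] go back to the available pool
```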

Symantec's Storage Foundation software suite uses an API to enable automated storage reclamation across storage hardware from several major vendors, as well as in Microsoft Hyper-V virtual storage environments.

The T10 Technical Committee of the International Committee for Information Technology Standards is working to make the API a standard for the rest of the industry that would allow storage management software from various vendors to automatically mark and reclaim unused blocks of data capacity.

"What we've seen over the past two years is an increased focus on sharpening the pencils and utilizing the data storage assets companies have in order to eke out better utilization rates," Loftgren said.

"And we see a lot of interest in that and technologies that are going to help companies lower their overall costs."

Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian or subscribe to Lucas's RSS feed. His e-mail address is lmearian@computerworld.com.

Copyright © 2010 IDG Communications, Inc.
