Another software-based approach to scalability is distributing "slices" of data across many physical databases. Cleversafe's dsNet technology, also sold as appliances, works best with more than a petabyte of storage made up of objects larger than 50KB to 100KB. That profile is ideal, says President and CEO Chris Gladwin, for applications such as photo sharing over the Web.
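In outline, dispersal means cutting each object into slices and adding redundancy so the object survives the loss of some of them. Below is a minimal sketch of the idea using a single XOR parity slice; production systems such as dsNet rely on stronger erasure codes that tolerate multiple failures, and the function names here are illustrative, not Cleversafe's API.

```python
# Toy dispersal: split an object into k data slices plus one XOR parity
# slice, so any single lost slice can be rebuilt from the survivors.
# Real dispersal systems use erasure codes that survive multiple losses.

def disperse(data: bytes, k: int) -> list:
    """Split data into k equal slices and append an XOR parity slice."""
    data += bytes((-len(data)) % k)          # pad to a multiple of k
    size = len(data) // k
    slices = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for s in slices:
        parity = bytes(a ^ b for a, b in zip(parity, s))
    return slices + [parity]

def rebuild(slices: list) -> list:
    """Recover the one missing slice (marked None) by XORing the rest."""
    missing = slices.index(None)
    size = len(next(s for s in slices if s is not None))
    acc = bytes(size)
    for s in slices:
        if s is not None:
            acc = bytes(a ^ b for a, b in zip(acc, s))
    slices[missing] = acc
    return slices

slices = disperse(b"web-scale photo bytes", 4)
slices[1] = None                             # simulate a failed node
# Stripping the zero-byte padding is fine for this demo.
print(b"".join(rebuild(slices)[:4]).rstrip(b"\x00"))
```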
What's Next
As hard drives get bigger and faster, flash gets bigger and more reliable, and open-source storage stacks mature, some industry watchers see fundamental changes in how organizations cope with the data flood.
With the adoption of new nonvolatile memory technologies, the need to tier data between solid state and spinning disk will diminish as those technologies become cost-competitive with higher-end Fibre Channel and SAS disks, predicts Shetti. Higher-capacity, lower-cost SATA disks will still have a role, but he says the complexity of packaging and of differing software interfaces will discourage users from mixing nonvolatile memory and SATA in the same system.
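For readers who haven't run a tiered array, the policy being automated is simple in outline: watch access counts and shuttle data between a fast, expensive tier and a slow, cheap one. The toy sketch below makes that concrete; the class, threshold and tier names are hypothetical, not any vendor's interface.

```python
# Toy two-tier placement policy: hot objects migrate to flash, cold
# ones to SATA. Real tiering engines track access patterns far more
# finely; the threshold here is an arbitrary illustration.

HOT_THRESHOLD = 10                 # accesses per window (hypothetical)

class TieredStore:
    def __init__(self):
        self.flash = {}            # fast, expensive tier
        self.sata = {}             # slow, cheap tier
        self.hits = {}             # access counts for the current window

    def put(self, key, value):
        self.sata[key] = value     # new data lands on the cheap tier
        self.hits[key] = 0

    def get(self, key):
        self.hits[key] = self.hits.get(key, 0) + 1
        return self.flash.get(key, self.sata.get(key))

    def rebalance(self):
        """Periodic job: promote hot data, demote cold data."""
        for key, count in self.hits.items():
            if count >= HOT_THRESHOLD and key in self.sata:
                self.flash[key] = self.sata.pop(key)   # promote
            elif count < HOT_THRESHOLD and key in self.flash:
                self.sata[key] = self.flash.pop(key)   # demote
            self.hits[key] = 0                         # reset the window
```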
Within three to five years, flash drives will cost about the same as high-performance disk, says Hu Yoshida, CTO at Hitachi Data Systems. They are already at parity, he says, once the effective capacity of the hard drives is reduced by short-stroking (using only part of the disk to speed performance by shortening the distance the read/write heads must travel to reach the data) and by writing data across multiple disks in RAID data-protection configurations.
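That parity claim is back-of-the-envelope arithmetic. The prices in the sketch below are hypothetical placeholders, not quoted figures, but they show how short-stroking and RAID overhead inflate a disk's effective cost per usable gigabyte:

```python
# Effective $/GB once short-stroking and RAID overhead are counted.
# All prices and capacities are hypothetical, for illustration only.

hdd_capacity_gb = 600          # 15K RPM SAS drive
hdd_price = 400.0
flash_capacity_gb = 200        # enterprise flash drive
flash_price = 1600.0

stroke_fraction = 0.25         # use only the outer 25% of the platters
raid_overhead = 2.0            # mirroring doubles the raw disk needed

usable_gb = hdd_capacity_gb * stroke_fraction
print(f"HDD, full capacity:            ${hdd_price / hdd_capacity_gb:.2f}/GB")
print(f"HDD, short-stroked:            ${hdd_price / usable_gb:.2f}/GB")
print(f"HDD, short-stroked + mirrored: ${hdd_price * raid_overhead / usable_gb:.2f}/GB")
print(f"Flash:                         ${flash_price / flash_capacity_gb:.2f}/GB")
# $0.67 -> $2.67 -> $5.33 per usable GB, against $8.00 for flash:
# most of the apparent price gap disappears.
```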
Even commodity hard drives, however, will gain speed as vendors add more cache to them. Seagate expects such "hybrid" drives to make up most of its product line by the middle of the decade.
Cloud storage services will provide slow but extremely low-cost archiving services to reduce the in-house storage load. Amazon Glacier, for example, costs as little as 1 cent per gigabyte per month. While "it could take three to five hours to retrieve that data," that might be no longer than it would take to restore data from tape stored offsite -- and Glacier would be cost-competitive with tape, says Greg Schulz, founder of consultancy StorageIO.
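At a penny per gigabyte-month, the cost arithmetic is easy to run. A quick sketch, using the rate quoted above and a hypothetical archive size:

```python
# Monthly and yearly archive cost at Glacier's quoted rate.
GLACIER_RATE = 0.01            # USD per GB per month (quoted above)

archive_tb = 100               # hypothetical archive size
archive_gb = archive_tb * 1024

monthly = archive_gb * GLACIER_RATE
print(f"{archive_tb} TB archived: ${monthly:,.2f}/month, "
      f"${monthly * 12:,.2f}/year")
# -> 100 TB archived: $1,024.00/month, $12,288.00/year
```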
"Object stores can reduce storage costs and complexity by eliminating the need for hierarchical file systems," says Gladwin. "In a very large data storage system, running a file system [requires] additional racks of servers" that consume power, take up space and cost money. With an object store, he says, an application such as a social media website lets a user search for friends without using a file system.
Meanwhile, IT shops continue to be drawn to the cloud's combination of cost efficiencies, low-cost hardware and low-cost, open-source software.
Constant Contact, for example, is considering "private storage clouds," possibly built with open-source software running on a provider's service such as Amazon S3, for the low costs and "almost unlimited horizontal scale" they can deliver, says Piesche. Using Cassandra, for example, he says he would like to scatter storage clusters among distributed data centers for disaster recovery "without any licensing costs, without any complicated setup and without any manual intervention."
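Cassandra's multi-data-center replication is declared at the keyspace level. A minimal sketch of that mechanism using the DataStax Python driver, with illustrative contact points, data-center names and replica counts, though, as noted below, it does not yet cover everything Piesche needs:

```python
# Create a keyspace that replicates across two data centers.
# Addresses, data-center names and replica counts are illustrative.
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.1.10", "10.1.1.10"])   # one contact point per DC
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS storage
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc_east': 3,
        'dc_west': 2
    }
""")
# Writes to tables in 'storage' now replicate automatically: three
# copies in dc_east, two in the disaster-recovery site dc_west.
```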
The replication capabilities he needs aren't available yet. But he has to keep looking because, as Schulz says, "For the vast majority of people there's no such thing as a data recession."
Scheier is a veteran technology writer. He can be reached at bob@scheierassociates.com.