Redefining mission-critical storage tiers



The traditional model of tiered storage is broken. “Fast and expensive vs. slow and cheap” doesn’t match how people actually use storage.

But there is a better way: service classifications.

A number of years ago I was asked to put together a storage tier strategy for one of my customers. At that time, automated storage tiering as we know it was still years away and I was a little reluctant to take on the project because tiering seemed overhyped back then.

The thing is, I'd never really seen storage tiering deliver on the promises made for it. On the other hand, the project offered a good opportunity to review some of the underlying assumptions driving their storage requirements, such as application availability, data protection and data management.

In the end—and much to my discomfort—the customer decided to define the storage tiers strictly based on performance characteristics. This had the effect of locking them into the following architecture:

  • Tier-1—Performance Tier
    • Performance is guaranteed
    • Runs on a frame array
  • Tier-2—Shared Tier
    • Performance is pooled and shared dynamically
    • Runs on a mid-range modular array using 10K RPM disk
  • Tier-3—Capacity Tier
    • Doesn’t care about performance
    • Runs on low end modular arrays using SATA

The result: Both the tier-1 and tier-3 storage systems became overused, with tier-2 lying unloved. This happens all the time, because very few people genuinely understand what their storage performance requirements are.

The risk-averse business units demanded the fastest storage for all their applications, and the cost-conscious ones put overly demanding applications onto the cheapest, slowest tier. This led to budget blowouts on one hand, and unhappy users on the other.

The key lesson: People want storage to be really fast when they’re accessing it, and really cheap when they’re not. They don’t want storage that’s “kind of fast” or “kind of cheap.”

Since then, storage technology has come a long way. Automated tiering and the increased use of storage pools—driven by virtualized workloads—have significantly increased the utilization of a "shared tier" within most data centers.

Recently, however, I’m beginning to see a re-emergence of old thinking. I’m seeing people design storage tiers based entirely around performance requirements. But this time, the tiers look like this:

  • Tier-1—Performance Tier
    • Performance is guaranteed
    • Runs on all-flash storage (in arrays or within servers)
  • Tier-2—Shared Tier
    • Performance is pooled and shared dynamically
    • Runs on a hybrid flash array using automated tiering
  • Tier-3—Capacity Tier
    • Doesn’t care about performance
    • Lives in the cloud

To avoid making the same mistakes as in the past, we need to move beyond performance as the dominant criterion for a storage service catalog. In fact, we need to stop thinking in terms of an ordered hierarchy altogether.

Instead, we should start thinking in terms of catalogs with multiple dimensions, including characteristics such as:

  • I/Os per second (random reads/writes)
  • Average latency (random reads/writes)
  • MB/sec (sequential reads/writes)
  • Mean time to data loss (MTTDL)
  • Recovery point objectives
  • Recovery time objectives
  • Failure domain resiliency
  • Continuous operations
  • Multi-tenancy
  • Security and encryption

Instead of “tiering,” I find it helpful to talk about service classifications for storage, based on different quality-of-service attributes. In this model, storage services have unique properties to support specific objectives—such as compliance, encryption, multi-decade data retention, indexing, and trans-national replication.
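To make the idea concrete, a multi-dimensional catalog can be modeled as data rather than as a ranked ladder: each service classification carries its own quality-of-service attributes, and a workload is matched against all of them at once. The sketch below is purely illustrative—the service names, figures, and the `match_service` helper are assumptions for the example, not any vendor's real catalog.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StorageService:
    """One entry in a multi-dimensional service catalog.
    Names and figures are illustrative, not real products."""
    name: str
    max_latency_ms: float  # average random-read latency ceiling
    min_iops: int          # guaranteed random I/Os per second
    rpo_minutes: int       # recovery point objective
    rto_minutes: int       # recovery time objective
    encrypted: bool
    multi_tenant: bool

# A hypothetical catalog: entries differ across several dimensions,
# not along a single fast-to-slow hierarchy.
CATALOG = [
    StorageService("compliance-archive", 50.0, 500, 1440, 2880, True, False),
    StorageService("latency-sensitive-oltp", 1.0, 100_000, 5, 15, True, True),
    StorageService("general-shared", 10.0, 10_000, 60, 240, False, True),
]

def match_service(max_latency_ms: float, min_iops: int,
                  rpo_minutes: int, needs_encryption: bool):
    """Return catalog entries that satisfy a workload's stated requirements."""
    return [
        s for s in CATALOG
        if s.max_latency_ms <= max_latency_ms
        and s.min_iops >= min_iops
        and s.rpo_minutes <= rpo_minutes
        and (s.encrypted or not needs_encryption)
    ]

# A database needing low latency, high IOPS, a tight RPO and encryption:
print([s.name for s in match_service(2.0, 50_000, 10, True)])
# → ['latency-sensitive-oltp']
```

The point of the sketch is that "fast" and "cheap" stop being the axes of the conversation: a workload states its requirements across every dimension, and the catalog answers with whichever classifications qualify.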

Focusing on service levels and business outcomes enables storage services to involve combinations of storage infrastructure within a pool (e.g., different arrays, different disk types, different clouds, or even good old fashioned tape).

Of course, this requires really good communication between IT infrastructure teams, business units, and developers. It also needs effective use of monitoring and infrastructure analytics tools, and a culture that understands how to automate the delivery of service level outcomes.

Or, to put it more simply, moving beyond traditional ideas around tiered storage will require the same foundations for success as that services-driven model we call “cloud.”
