Don't use server benchmarks to measure storage performance

To my surprise, I have been seeing storage performance results reported using server benchmarks and their associated server metrics. Server benchmarks are specifically designed to measure the maximum performance of the server -- NOT the storage.

Even if external storage is used in the benchmarked system, server benchmarks are built to push the server's processing power to its limit. Often, when storage performance is measured with a server benchmark, the server and OS are not even tuned or optimized, and the storage is never offered the full load it can handle. This means that storage performance is ‘left on the table,’ and the primary metrics reported are skewed. These skewed metrics do not reflect the true performance of the external storage array.

For example, the VMmark benchmark was specifically designed to measure how many virtual machines under active business application load a server can handle. If the I/O activity is measured during this benchmark, it will show that the I/O load on the average external storage array is light to medium. That means that even though the server running the benchmark may reach its maximum performance, the storage could handle more I/O activity.

In some cases, the storage could handle several servers under similar loads. Unfortunately, the single-server performance results are what get reported, and they are nowhere near the full performance capacity of the external storage array. This is especially unfortunate when more performant all-flash arrays are involved, because their true performance value goes unreported.
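The gap between what a single server drives and what the array could deliver can be put into rough numbers. The sketch below is illustrative only: the function names and the IOPS figures are hypothetical, not measurements from VMmark or from any particular array.

```python
import math


def storage_utilization(observed_iops: float, array_max_iops: float) -> float:
    """Fraction of the array's I/O capability that the benchmark exercised."""
    if array_max_iops <= 0:
        raise ValueError("array_max_iops must be positive")
    return observed_iops / array_max_iops


def servers_to_saturate(per_server_iops: float, array_max_iops: float) -> int:
    """Number of similarly loaded servers needed to reach the array's limit."""
    if per_server_iops <= 0:
        raise ValueError("per_server_iops must be positive")
    return math.ceil(array_max_iops / per_server_iops)


# Hypothetical figures, chosen only to illustrate the arithmetic.
observed = 40_000      # IOPS seen at the array while one server is saturated
array_max = 200_000    # IOPS the array sustains under a storage benchmark

print(f"Array utilization: {storage_utilization(observed, array_max):.0%}")
print(f"Servers needed to saturate the array: {servers_to_saturate(observed, array_max)}")
```

With these assumed figures, the server benchmark would exercise only a fifth of the array, so reporting the single-server result as the storage result would understate it by a factor of five.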

To accurately measure storage performance, the benchmark must drive the backend storage, not the server's processors, to its limit. That is why storage engineers have specifically designed storage benchmarks to maximize and accurately measure the performance of storage systems. Most shared storage arrays have the capacity to handle several heavy workloads from many servers at a time.

The workload pattern delivered to a storage array is a blend of many workloads. The data content and the type of I/O load delivered to the storage under a benchmark have to reflect these blended patterns. And that is before accounting for the simulated moving hot spots, skew, non-uniform access, and caching effects that are designed into more complex storage benchmarks.
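To make the idea of a blended, skewed workload concrete, here is a minimal sketch of a request-trace generator. It is not taken from any real storage benchmark: the `Stream` fields, the stream names and mix, and the sliding hot-region model are all assumptions made for illustration, and the code only produces a list of requests rather than issuing any I/O.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """One workload in the blend (all fields are illustrative assumptions)."""
    name: str
    weight: float           # share of total requests
    read_fraction: float    # probability that a request is a read
    block_kib: int          # request size in KiB
    hot_fraction: float     # fraction of the address space that is "hot"
    hot_probability: float  # probability that a request lands in the hot region


def blended_workload(streams, n_requests, total_blocks, seed=0):
    """Return a list of (stream, op, block, size_kib) tuples."""
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    weights = [s.weight for s in streams]
    requests = []
    for i in range(n_requests):
        s = rng.choices(streams, weights=weights, k=1)[0]
        hot_blocks = max(1, int(total_blocks * s.hot_fraction))
        # Moving hot spot: the hot region slides across the address
        # space as the run progresses.
        hot_start = (i * total_blocks // n_requests) % total_blocks
        if rng.random() < s.hot_probability:
            block = (hot_start + rng.randrange(hot_blocks)) % total_blocks
        else:
            block = rng.randrange(total_blocks)  # uniform "cold" access
        op = "read" if rng.random() < s.read_fraction else "write"
        requests.append((s.name, op, block, s.block_kib))
    return requests


# Hypothetical mix of three workloads sharing one array.
STREAMS = [
    Stream("oltp", 0.6, 0.7, 8, 0.05, 0.8),
    Stream("analytics", 0.3, 0.95, 256, 0.2, 0.5),
    Stream("backup", 0.1, 0.1, 1024, 1.0, 0.0),
]

trace = blended_workload(STREAMS, n_requests=1000, total_blocks=10_000, seed=1)
print(trace[:3])
```

Even this toy version shows why a single server benchmark is a poor proxy: the array sees interleaved request sizes, read/write ratios, and locality patterns that no one application produces on its own.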

To accurately measure storage performance, use a storage benchmark that has been specifically designed to measure the maximum performance of storage. Server benchmarks just don’t do the job.

This article is published as part of the IDG Contributor Network.
