19 Techniques to Control the Chaos in Data Storage

There’s very little in this world that can’t generate data – anything that has measurable activity will do it. That includes nearly everything in the business environment, and organizations’ demand for that information is insatiable.

But before data can be analyzed and acted upon, it needs to be captured, stored, and organized. For a long time, IT leaders could ignore the rising tide of data because the cost of storage kept dropping dramatically. Problem is, the cost of managing all that cheap storage has been climbing steadily at the same time.

Storage is spinning out of control. According to a study by Primary Data, 51 percent of surveyed IT administrators manage 10 or more different types of storage. One-third scramble to control 20 different storage resources.

“Where you have multiple architectures and solutions that don’t talk to each other, you have deep complexity and inefficiencies,” noted Vish Mulchand (@vishmulchand), senior director of product management and marketing, HPE Storage. “You have, essentially, technology islands, and that’s probably the single largest factor driving complexity. Islands may be great for vacations, but not for data storage.”

HPE teamed up with Computerworld to create a series of articles that we believe is a rich resource for organizations looking to consolidate and simplify their data storage architecture: Reduce Storage Complexity with HPE Storage. In addition, we wanted to reach out to the Computerworld community to get readers’ take on the storage complexity challenge.

We asked storage pros, “How do you increase storage efficiency and reduce storage complexity in the datacenter?” Here is a collective view from the Computerworld readership. You can see from the varied responses that it takes a lot of hard work to make things simple, and a variety of techniques exist, depending on the problem at hand and the specific situation. The good news is that there is significant innovation to address these issues. The challenge is picking the right approach – this requires careful analysis to find the point where the right technology solution intersects with your specific requirements and capabilities.

1: Everywhere you look, latency. Minimize it.

“The key to sorting through anything storage-related in the datacenter is to concentrate on latency, not on IOPs (I/O operations per second). IOPs can lie, but latency never does,” said Edward Haletky (@texiwill), managing director, The Virtualization Practice. “If you can measure latency at all levels, you can pinpoint an issue where latency increases.”

Haletky suggests you look across your storage environment, at every possible latency point (e.g., your array, host, VM). Each layer adds latency and possible complexity.
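
A quick way to put Haletky's advice into practice at the host layer is a simple read-latency probe. The sketch below is a minimal, illustrative example, not a production tool: the test-file path, read size, and sample count are assumptions you would swap for your own, and the OS page cache will still flatter cached reads.

```python
# Minimal host-level latency probe: time individual 4 KB reads and report
# percentiles. TARGET, READ_SIZE, and SAMPLES are placeholder assumptions;
# point the probe at a test file in your own environment.
import os
import statistics
import time

TARGET = "/tmp/latency_test.bin"   # hypothetical test file
READ_SIZE = 4096                   # 4 KB, a common small-I/O size
SAMPLES = 1000

# Create a test file if it does not already exist.
if not os.path.exists(TARGET):
    with open(TARGET, "wb") as f:
        f.write(os.urandom(READ_SIZE * SAMPLES))

latencies_ms = []
file_size = os.path.getsize(TARGET)
with open(TARGET, "rb", buffering=0) as f:   # unbuffered at the Python layer
    for i in range(SAMPLES):
        offset = (i * READ_SIZE) % max(file_size - READ_SIZE, 1)
        start = time.perf_counter()
        f.seek(offset)
        f.read(READ_SIZE)
        latencies_ms.append((time.perf_counter() - start) * 1000)

latencies_ms.sort()
print(f"p50: {statistics.median(latencies_ms):.3f} ms")
print(f"p99: {latencies_ms[int(len(latencies_ms) * 0.99)]:.3f} ms")
print(f"max: {latencies_ms[-1]:.3f} ms")
```

Running the same measurement inside a VM, on the hypervisor host, and against the array's own counters makes it much easier to see which layer is adding the time.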

One latency-saving technique, suggests Mike Chase (@dinCloud), CTO, dinCloud, is to apply traditional mechanisms such as encryption, compression, and de-duplication as data is written (in-line) rather than after it has been saved to disk (post-process). Waiting until the data is on disk increases system load, because those operations then have to re-read and rewrite it to gain any efficiency.

2: Location, location, location

“Considering location when creating storage helps streamline interconnection for your hybrid cloud strategies,” recommended Chris Sharp (@DigitalRealty), CTO, Digital Realty. “By identifying co-location environments to house your storage in close proximity to public cloud compute nodes, [you] can make a connection via a cross-connect to the public cloud, delivering extremely limited lag time without the need for expensive network tethers. With this approach, the overall architecture requires a lot less components to be successful, significantly reducing complexity and reliance on the Internet.”
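
One simple way to sanity-check how close your storage really sits to a public cloud region is to measure round-trip time to its endpoints. The sketch below is a rough, assumption-laden probe: the hostnames are placeholders, and it times TCP connection setup, which is a reasonable proxy for comparing a cross-connect against the open Internet.

```python
# Rough proximity check: time TCP connection setup to candidate endpoints.
# The hostnames below are placeholders; substitute the actual endpoints for
# the cloud regions and cross-connects you are comparing.
import socket
import time

ENDPOINTS = [
    ("storage.example-cloud-region-a.com", 443),   # hypothetical
    ("storage.example-cloud-region-b.com", 443),   # hypothetical
]

for host, port in ENDPOINTS:
    samples = []
    for _ in range(5):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=3):
                pass
        except OSError as exc:
            print(f"{host}: unreachable ({exc})")
            break
        samples.append((time.perf_counter() - start) * 1000)
    if samples:
        print(f"{host}: best {min(samples):.1f} ms over {len(samples)} tries")
```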

3: Move storage and its management to the cloud

Jaspreet Singh (@Jaspreetis), CEO, DRUVA, proposed a ‘nuclear’ option for on-premises IT.

“Eliminating the datacenter altogether is the ultimate means to reduce its complexity,” said Singh. “Moving storage to the cloud eliminates the need for hardware and all the management and overhead required for its infrastructure.”

IT consultant and blogger Ian Apperley (@ianapperley) of whatiswellington said most of his clients are completely bypassing infrastructure, moving straight to SaaS solutions.

“Optimizing efficiency and reducing management complexity of storage is a key feature of today’s cloud,” added Eric Sarault (@esarault), product manager, Internap.

Cloud nirvana doesn’t come about without some work. “Public cloud use introduces its own challenges, such as data migration, performance, security, and latency,” admitted ClearSky Data CEO and co-founder, Ellen Rubin (@ellen_rubin), who suggested that administrators should take advantage of managed services to overcome these roadblocks.
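
For teams taking the cloud route incrementally, the first practical step is often just getting data into object storage. The sketch below is a minimal example assuming the AWS SDK for Python (boto3) and a hypothetical bucket and path; the same pattern applies to any object store with an S3-compatible API.

```python
# Minimal directory-to-object-storage copy using boto3 (pip install boto3).
# The bucket name and local path are placeholders; credentials come from the
# usual AWS configuration (environment variables, ~/.aws, or an IAM role).
import os
import boto3

BUCKET = "example-archive-bucket"   # hypothetical bucket
LOCAL_ROOT = "/data/to-migrate"     # hypothetical local path

s3 = boto3.client("s3")

for dirpath, _dirnames, filenames in os.walk(LOCAL_ROOT):
    for name in filenames:
        local_path = os.path.join(dirpath, name)
        # Use the path relative to LOCAL_ROOT as the object key.
        key = os.path.relpath(local_path, LOCAL_ROOT).replace(os.sep, "/")
        s3.upload_file(local_path, BUCKET, key)
        print(f"uploaded {local_path} -> s3://{BUCKET}/{key}")
```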

4: Software rules the world and your infrastructure

“Adopt a software-plus-commodity-hardware approach,” advised Avinash Lakshman (@hedviginc), CEO and founder, Hedvig. “Built on the distributed-systems principles pioneered by web-scale companies, software-defined storage infrastructure brings a cloud-like simplicity into the private datacenter.”

“The benefits from a software-defined, services-based approach are significant, including reduced hardware costs as you now buy inexpensive, commodity storage,” said Tim Cuny (@OptimizeWithCMI), VP of solutions, CMI.

“This architecture removes the inefficiencies present in legacy storage technologies and greatly simplifies storage management because regardless of the data type, there is a single way to provision, manage, replicate, and repurpose these storage resources,” said Mark Lewis (@ml62), CEO, Formation Data Systems.

The true benefit of a software-defined infrastructure (SDI) approach is that it “allows you to change your perspective from managing efficiencies in cost to flexibly delivering efficiencies in time-to-value for your business,” added Cuny.
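
The “single way to provision, manage, replicate, and repurpose” that Lewis describes usually surfaces as an API rather than a per-vendor console. The sketch below is purely illustrative: it assumes a hypothetical software-defined storage controller with a REST endpoint, and shows how one provisioning call could serve very different workloads through policy profiles.

```python
# Illustrative only: provisioning volumes through one hypothetical SDS REST
# API instead of per-array tools. The endpoint, fields, and token are all
# assumptions, not a real product's interface.
import requests

SDS_API = "https://sds-controller.example.internal/api/v1/volumes"  # hypothetical
TOKEN = "REPLACE_ME"

def provision_volume(name, size_gb, profile):
    """Ask the (hypothetical) SDS control plane for a new volume.

    `profile` bundles policy (replication, tier, protocol), so callers never
    need to know which physical boxes end up holding the data.
    """
    resp = requests.post(
        SDS_API,
        json={"name": name, "size_gb": size_gb, "profile": profile},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# The call shape stays the same regardless of workload type:
# provision_volume("erp-db-01", 500, profile="block-high-iops")
# provision_volume("media-share", 2000, profile="file-archive")
```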

5: Flash: you know it’s the fastest storage

“Your most precious asset in a datacenter – besides the data itself – is time,” said Rob Peglar (@Peglarr), VP of advanced storage, Micron Technology.

“Modern applications require extremely fast I/O, and flash and SSDs have reached sufficient capacities at a low enough price to be considered as a primary storage medium,” suggested Ravi Kalakota (@liquidhub), lead partner, LiquidHub.

Flash’s advantages don’t stop at the obvious need for more speed. Security firm Armor uses flash for its extreme-performance tier. Jim Rexroat (@armor), an L3 storage engineer, explained that flash drives allow for much denser storage: Armor has been able to nearly double storage capacity in the same footprint while greatly reducing heat production and power requirements.

6: A mix of Solid State Drives (SSDs) and HDDs could actually be less confusing

For larger environments that also have to handle archival storage, a Solid State Array (SSA)-only environment, composed entirely of SSDs, probably wouldn’t be appropriate.

Chris Squatritto (@Mosaic451), architect of the Americas, Mosaic451, found that a hybrid array helped reduce the complexity of managing data. 

“By allowing the technology to intelligently move data as it learns workloads, you reduce the need for so many storage administrators,” said Squatritto, whose firm used a hybrid array to consolidate a client’s three racks of traditional storage down to just half a rack.

Deployed for one of the nation’s largest K-12 school districts, the new system handled all kinds of workloads at a lower cost than a flash-only solution.

“By migrating to a single system, administrative overhead was reduced by 50 percent,” said Squatritto. “And the school district was also able to reduce their hosting costs with their co-location company by 60 percent.”

According to Squatritto, the district conservatively estimated it could recover its costs within four years.
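
The “intelligent movement” in a hybrid array happens inside the controller, but the idea is easy to see in miniature. Below is a toy sketch, not anything Mosaic451 deployed: it promotes recently touched files to a fast tier and demotes cold ones, using nothing more than file access times and two hypothetical directories standing in for SSD and HDD pools.

```python
# Toy hot/cold tiering: promote files accessed in the last day to the "ssd"
# tier, demote everything else to the "hdd" tier. The two directories are
# stand-ins for real pools; a hybrid array does this internally at block level.
import os
import shutil
import time

SSD_TIER = "/srv/tier-ssd"   # hypothetical fast pool
HDD_TIER = "/srv/tier-hdd"   # hypothetical capacity pool
HOT_WINDOW = 24 * 3600       # "hot" = accessed within the last 24 hours

def rebalance():
    now = time.time()
    for tier, other, want_hot in ((HDD_TIER, SSD_TIER, True),
                                  (SSD_TIER, HDD_TIER, False)):
        for name in os.listdir(tier):
            path = os.path.join(tier, name)
            if not os.path.isfile(path):
                continue
            is_hot = (now - os.stat(path).st_atime) < HOT_WINDOW
            # Promote hot files out of HDD; demote cold files out of SSD.
            if is_hot == want_hot:
                shutil.move(path, os.path.join(other, name))
                print(f"moved {name}: {tier} -> {other}")

if __name__ == "__main__":
    rebalance()
```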

7: They’ve combined shampoo and conditioner in one, why not compute and storage?

“Deploying hyper-converged infrastructure (HCI) is the best way to reduce storage complexity and improve agility and infrastructure efficiency for virtualized workloads,” said Skip Bacon (@sbacon_vmware), VP, products - storage and availability, VMware. “HCI combines compute and storage administration into a common control plane to unify management and administration of virtual machines and compute and storage infrastructure.”

Be careful, warned Ron Nash (@hronaldnash), CEO, Pivot3, noting that “not all hyperconverged vendors are created equal.”

Nash argues that many of the latency issues that could result from intense processing in an HCI can be solved with erasure coding, which eliminates the need for deduplication and compression while providing exceptional fault tolerance.
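
Erasure coding spreads data plus computed parity across devices so that a lost piece can be rebuilt from the survivors. Production systems use Reed-Solomon-style codes across many fragments; the sketch below is a deliberately stripped-down single-parity version (essentially the RAID-5 idea) just to show how a missing block is reconstructed.

```python
# Simplified single-parity erasure coding: split data into blocks, compute one
# XOR parity block, then rebuild any single lost block from the rest. Real
# erasure codes (e.g., Reed-Solomon) tolerate multiple simultaneous losses.
BLOCK = 4

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = b"hello, erasure coding!!!"               # 24 bytes -> 6 data blocks
blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
parity = xor_blocks(blocks)                      # stored on its own device

# Simulate losing block 2, then rebuild it from the survivors plus parity.
lost_index = 2
survivors = [b for i, b in enumerate(blocks) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

assert rebuilt == blocks[lost_index]
print("rebuilt block:", rebuilt)
```

The appeal over straight mirroring is capacity: one parity block protects many data blocks, instead of keeping a full second copy of everything.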

8: Let your staff be distributed, not your data

Do you really need mini-datacenters at each one of your remote offices? Are they as secure and as well maintained as your central corporate datacenter? What if you were to eliminate all those edge datacenters? That would probably reduce complexity considerably.

“Organizations with multiple satellite offices can simplify their environment by utilizing edge appliances that talk to a central data center. Each edge device is packed with an intelligent storage cache, virtualization, and WAN optimization, which enables IT teams to centrally protect their data assets, recover from outages, and provision new services and sites across the business at a fraction of the time it used to take,” explained Alison Conigliaro-Hubbard (@aliconig), sr. director of product marketing for SteelFusion, Riverbed. “By eliminating backup and the time it takes to recover from an outage or to provision a new site, a 10-site enterprise can typically save roughly $750,000 over a four-year period.”

9: Solve that darn data duplication problem

“The most common cause of inefficient and overly complex datacenters is data duplication,” said Leo Welder (@CW_LeoW), CEO, ChooseWhat.com.

“Centralize, singularize, and collaborate,” said Steve Prentice (@stevenprentice), senior writer, CloudTweaks.

Prentice is speaking not from the viewpoint of an IT administrator, but as a user on the receiving end of endless iterations of documents, PDFs, and PowerPoint files that travel across the network, often as email attachments.

“Organizations need to take charge in centralizing documents in a collaborative environment, where editing can be done in a shared space, and where hyperlinks to a single shared document replace multiple attached copies,” added Prentice.

One ideal solution to ensure happy users and a less panicked IT department is to adopt an enterprise-grade shared drive.

“Policy-driven hybrid cloud enterprise file sync and share (EFSS) solutions provide IT with document storage location choice (public cloud, private cloud, and on-premises) at the user group or folder level, which enables compliance with security, privacy, and data residency regulations,” said Kevin de Smidt (@syncplicity), VP product management and product marketing, Syncplicity.

10: Keep the same data, but use less space

While you may have solved your redundant file problem, your files still have redundant data chunks, often from versioned files. Data deduplication eliminates those redundancies and creates pointers to a single copy of a data set.

“In a typical data center without this feature turned on, there are between 3:1 and 50:1 copies of the same data set,” said Todd Traver (@UptimeInstitute), VP, IT optimization and strategy, Uptime Institute.
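
You can get a rough feel for how much duplication a data set carries before buying anything: hash fixed-size chunks and compare unique chunks to total chunks. The sketch below does exactly that over a directory; the 64 KB chunk size and the path are arbitrary assumptions, and real deduplication engines typically use variable-size, content-defined chunking.

```python
# Rough dedup-ratio estimate: hash fixed-size chunks across a directory and
# compare total chunks to unique chunks. Chunk size and path are assumptions;
# production dedup usually uses variable, content-defined chunk boundaries.
import hashlib
import os

ROOT = "/data/sample"     # hypothetical directory to analyze
CHUNK = 64 * 1024         # 64 KB fixed chunks

total = 0
unique = set()
for dirpath, _dirs, files in os.walk(ROOT):
    for name in files:
        try:
            with open(os.path.join(dirpath, name), "rb") as f:
                while chunk := f.read(CHUNK):
                    total += 1
                    unique.add(hashlib.sha256(chunk).digest())
        except OSError:
            continue  # skip unreadable files

if unique:
    print(f"{total} chunks, {len(unique)} unique -> "
          f"estimated dedup ratio {total / len(unique):.1f}:1")
```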

The second step to reduce your data’s storage footprint is to compress your data, but that doesn’t come without side effects.

“Data compression does have some overhead, in that it requires CPU cycles to perform both the compression and decompression of data,” said Richard Florence (@Entisys360), principal architect, Entisys360.
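
The trade-off Florence describes is easy to quantify on your own data before committing. The short sketch below uses Python's built-in zlib on a sample file you choose (the path is a placeholder) to time compression at a few levels and report the ratio each one buys, so you can see where extra CPU stops paying for itself.

```python
# Measure compression ratio vs. CPU time at several zlib levels on a sample
# file. The path is a placeholder; use data representative of what the array
# or application would actually be compressing.
import time
import zlib

SAMPLE = "/data/sample/representative.bin"   # hypothetical sample file

with open(SAMPLE, "rb") as f:
    payload = f.read()

for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = len(payload) / len(compressed)
    print(f"level {level}: ratio {ratio:.2f}:1 in {elapsed * 1000:.1f} ms")
```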

11: Move data deduplication and compression to the storage array

“A great way to increase efficiency of storage within the data center is to invest in a storage array which includes data de-duplication and compression within the array itself,” said Denny Cherry (@mrdenny), owner and principal consultant, Denny Cherry & Associates Consulting. “Typically dedupe and compression at the storage array is more effective than at the application level, as the array is able to compare data across LUNs (logical unit numbers) for deduplication.” 

Cherry’s testing has shown that compression at the application level is about 30 percent slower than at the storage array.

12: Stop backing up everything

“I’ve frequently found that many data centers back up everything, without giving much thought as to what data they’re storing. That leads to unnecessarily large and complex backups,” said Zac Cogswell (@wiredtree), CEO, WiredTree. “You don’t need four different backups of the same system. There are certain files on that system – for instance, months-old access logs – that most data centers probably don’t need to restore in the event that something goes wrong.”
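
Putting Cogswell's advice into practice mostly means deciding, in one obvious place, what not to back up. The sketch below is a bare-bones illustration with assumed paths and rules: it builds a tar archive while skipping files that match exclude patterns or logs older than 90 days.

```python
# Bare-bones selective backup: archive a tree while skipping excluded patterns
# and stale logs. Paths, patterns, and the 90-day cutoff are all assumptions
# to adapt; the point is that the exclusion policy lives in one place.
import fnmatch
import os
import tarfile
import time

SOURCE = "/var/www"                     # hypothetical tree to protect
ARCHIVE = "/backups/www-selective.tar.gz"
EXCLUDE_PATTERNS = ["*.tmp", "*.cache", "core.*"]
LOG_MAX_AGE = 90 * 24 * 3600            # skip logs older than 90 days

def skip(path):
    name = os.path.basename(path)
    if any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_PATTERNS):
        return True
    if name.endswith(".log"):
        return (time.time() - os.stat(path).st_mtime) > LOG_MAX_AGE
    return False

with tarfile.open(ARCHIVE, "w:gz") as tar:
    for dirpath, _dirs, files in os.walk(SOURCE):
        for name in files:
            path = os.path.join(dirpath, name)
            if skip(path):
                continue
            tar.add(path, arcname=os.path.relpath(path, SOURCE))
```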

13: Embrace the Ethernet

In the not-so-distant past, the need for more storage and more access to it gave rise to the Storage Area Network, or SAN, which required Fibre Channel networks with Fibre Channel switches to connect servers to SANs, explained Adam Stern (@iv_cloudhosting), founder and CEO, Infinitely Virtual.

Luckily, with time, the network has simplified.
