Back to the future, or how I was 10 years too early
Every year I try to look into my crystal ball and anticipate the big new things and latest trends in storage so I can prepare for the coming year. My goal is to keep those in the storage trenches informed so they can cultivate the right skill sets as they introduce new technologies into their storage infrastructure. To prepare for this considerable task, I usually have to do a ton of research into what is currently in labs across the globe that may be disruptive. Then I take the temperature of the market by attending all the key trade shows to determine where the latest buzz is coming from.
It's also always entertaining for me to take a look back at my previous predictions, so I decided to go back and re-read my previous attempts at foreseeing the future direction of storage, and review my hits and misses.
I was surprised when I reread my predictions for the storage industry for 2004. I smiled to myself as I read the term "programmable storage", which was a term I used back then. It described the convergence of technologies that would enable automated data centers and storage management through software, and it focused on applications rather than specific components or hardware elements. Interestingly enough, all I needed to do in order to make my 2004 predictions relevant as a forecast for 2014 was replace the term "programmable" with "software defined", and I found my forecast from 2004 was almost spot on, but it was a decade ahead of its time. The good news is the industry finally caught up in 2014! Since my 2004 forecast was before my time here at Computerworld (it was published previously, but you'll have to register to see it there), I will provide an overview here...
Storage Predictions 2004: The year of Programmable Storage
Perhaps I should clarify what I mean by that. Let’s start by looking back at 2003.
During 2003, companies were busy saving money by consolidating their storage and server environments. The move was toward Utility Computing, with grid computing as an enabling technology to that end. Many companies took advantage of new storage technologies like 2Gbit Fibre Channel and ATA-based disk devices. Many consolidated their production applications onto larger, more fault-tolerant, mainframe-class storage arrays. As the total amount of data grew, many companies also redesigned their storage plumbing by adding director-class core switches so more ports could be added to grow their existing fabrics. To ease management and add data replication capabilities, WAN plumbing was added to connect all their existing SAN islands together. Many were busy creating standards for new storage requirements and figuring out how the new regulatory requirements affected them, putting plans in place to satisfy those requirements.
2014 Comment: All I can say is wow. This all came true. Utility computing became the cloud, and WAN plumbing for SAN islands became converged infrastructure.
During the latter part of 2003, large companies upgraded their existing SAN infrastructures to the latest generation of hardware, and standardized their environments by limiting the number of vendors that participate in their SANs. Using a standards approach for new storage purchases drove down overall costs, but many found it limited their ability to take advantage of new, innovative technologies being delivered by small startup companies. So what is going to happen in 2004 to solve these issues? In 2004, we will see the beginnings of a migration to Fabric Based Intelligence, Packet/Block Level Virtualization, and Object Model Data Management. These "three pillars of growth" in the storage industry will enable what I am calling "Programmable Storage". Now let's look at how the convergence of these three technologies allows this to happen:
2014 Comment: Can anyone say modular POD-based data center building blocks? vBlock, FlexPod, PureSystem, etc... Fabric intelligence becomes the software-defined data center, while packet/block level virtualization and object model data management become VMware and Hyper-V, and erasure-coded disk.
This makes me think I should really do an update this year, and then wait for 2024 to see what happens! If only I could pick stocks like this...
You can read the details of my predictions on the next page.
2004, the year of programmable storage
Fabric Based Intelligence
As switch vendors continue partnering with innovative startups and virtualization standards come to fruition, we will see companies start to implement the first real products for storage virtualization at the fabric level. The first products in this area are already available today in the form of application-specific appliances that monitor the data flow through a fabric, and either route the data to a specific storage device, or replicate the data and route the replica to a new location for disaster recovery. Another example is multi-protocol switching, which enables SAN, NAS, and TCP/IP consolidation.
In 2004, you will be able to connect to storage no matter what protocol your application happens to use. Many storage vendors are busy creating solutions that allow for SAN/NAS and multi-protocol convergence by providing intelligent solutions that include not only Fibre Channel, but also NFS, CIFS, IP, and iSCSI connectivity. As an example, I found a few companies that actually migrated the entire core of their SAN to an iSCSI solution. Using intelligent iSCSI gateways as the core fabric switches between the HBAs on the hosts and their Fibre Channel-based storage arrays, these companies leveraged their existing hardware and experience with IP to help them save costs. These new, faster intelligent switches will be able to crack open FC packets and examine their contents, which enables the creation of content-based routing policies.
Packet/Block Level Virtualization
Forward thinking IT departments do not just want to provide low cost storage solutions to their users. They want to provide different cost structured pools of storage that have different service levels assigned to them. They want to provide the right storage, based on the requirements of the applications. Wouldn't it be cool if all your data magically migrated itself to different classes of storage based on its age, frequency of access, or even regulatory needs? Virtualization allows for the creation of pools of storage from multiple vendors, using combinations of high performance storage for production applications, and lower cost pools of storage (ATA disks, tape or WORM gear) for backup and data retention.
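The tiering idea described above is, at heart, a policy table: given a few facts about a piece of data, pick the cheapest pool that still meets its service level. Here is a minimal sketch of such a policy; the tier names, thresholds, and the `pick_tier` function are all hypothetical illustrations, not any vendor's actual product behavior:

```python
# Hypothetical tiers, ordered from most to least expensive.
TIERS = ["fc-array", "ata-pool", "worm-archive"]

def pick_tier(age_days, accesses_per_week, retention_mandated):
    """Choose a storage tier from simple policy rules (illustrative thresholds)."""
    if retention_mandated:
        return "worm-archive"      # regulated data goes straight to WORM gear
    if age_days < 30 or accesses_per_week > 10:
        return "fc-array"          # hot data stays on high-performance disk
    if age_days < 365:
        return "ata-pool"          # warm data moves to low-cost ATA disk
    return "worm-archive"          # stale data is archived

print(pick_tier(age_days=5, accesses_per_week=50, retention_mandated=False))
# fc-array
```

A real implementation would run rules like these periodically against metadata gathered from the fabric, then trigger the actual block migration, but the decision logic is this simple at its core.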
Packet and block level virtualization, in conjunction with intelligent fabrics, will eliminate the headaches of data migration, reduce reliance on proprietary storage based data replication solutions, and allow the creation of business policies that can be enforced at the individual data block or packet level of the data stream. Block level virtualization will enable thin provisioning, where every server thinks it has a huge pool of storage available to it, but the server is only allocated what it needs.
Thin provisioning will eliminate the need to manage storage growth at the server level. We will see virtualization solutions in the form of both hardware and software, and they will be implemented at all three layers of the SAN (the host level, the fabric level, and the storage level).
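The thin-provisioning concept is easy to model: the server sees a large virtual volume, but physical blocks are claimed only on first write. The toy class below is purely illustrative (real arrays allocate in fixed-size extents, not single blocks):

```python
class ThinVolume:
    """Toy thin-provisioned volume: advertises virtual_size_blocks to the
    host, but physical blocks are allocated only when first written."""

    def __init__(self, virtual_size_blocks):
        self.virtual_size = virtual_size_blocks
        self.blocks = {}                     # block number -> data

    def write(self, block_no, data):
        if not 0 <= block_no < self.virtual_size:
            raise IndexError("write beyond advertised capacity")
        self.blocks[block_no] = data         # allocate on first write

    def allocated(self):
        return len(self.blocks)              # physical blocks actually consumed

vol = ThinVolume(virtual_size_blocks=1_000_000)  # the server "sees" 1M blocks
vol.write(0, b"boot")
vol.write(42, b"data")
print(vol.virtual_size, vol.allocated())
# 1000000 2
```

The gap between the advertised and allocated figures is exactly what lets the array oversubscribe capacity across many servers.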
Object Model Data Management
Everything needs to be managed from the perspective of the application. Everyone wants the ability to monitor the data stream, and to create a policy that automatically eliminates stale, useless, and non-essential data. They also want to capture information about the content of all that data, so they can make better business decisions about where the data should be stored. In order to store information more efficiently, these are the questions that need to be asked:
What data is frequently accessed?
What data is stale, and is a good candidate to be moved to archive type storage?
Are there hidden gold nuggets of customer information that can be gleaned from the content of my files to help drive new business?
What data types should be stored on expensive high-end storage, and which on tape or cheap disks?
What are the performance metrics for each data type?
When was everything last backed up?
Do I need to actually back everything up, or can I get rid of a lot of stale data?
Which data is mandated to be kept by regulatory requirements?
What are the properties of specific storage arrays?
What methods are available within a particular storage array?
Since intelligent data management software means being application aware, the right management software not only needs to be able to manage and monitor storage, but also the entire storage path, the servers, the client network, and the applications themselves. Wouldn't it be cool if you could use a single web console to monitor and manage everything in the data center? In 2004, we will finally start to see the benefits of standardization efforts driven by the Storage Networking Industry Association (SNIA) and the Internet Engineering Task Force (IETF). The Common Information Model (CIM) and Web-Based Enterprise Management (WBEM) will enable IT departments to create application process flow policies for their data.
New technologies and tools are becoming available to help you gather useful information about your data, which can help you manage your storage more efficiently. The SNIA is currently working on the mechanics of using object-based storage methods, where metadata about the data you store will allow you to manage that data better. Let's face it, data has a life cycle. The older it gets, the less important it is.
Wouldn't it be nice if, as the data you store gets older and less accessed, it could be dynamically and automatically migrated to different tiers of storage devices, based on the policies you create? Wouldn't it also be nice to never have to worry about government regulations regarding your data, since the intelligent SAN policies you create will automatically take care of migrating that data to WORM-like storage devices?
This all becomes possible in 2004 as vendors conform to the SNIA Storage Management Initiative Specification (SMI-S), which takes a page out of application programming, with classes, objects, and methods that can be interrogated through CIM/SOAP and XML solutions. The combination of intelligent fabrics, file systems, storage arrays, and application-aware management software will let you create "programmable storage". XML metadata tags that describe data can be stored with it. Policies can be created to manage different data types. Storage can be classified by properties, such as performance or reliability. Methods can be invoked through the management software (hardware-based snapshots, for example, will be one such method).
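The classes/objects/methods framing can be sketched in a few lines of code. This is a hypothetical illustration of the idea, not the actual SMI-S schema: the `StorageArray` class, its property names, and the `describe` helper are all invented for the example:

```python
import xml.etree.ElementTree as ET

class StorageArray:
    """Hypothetical managed object: properties describe the array,
    methods are operations the management software can invoke."""

    def __init__(self, name, tier, iops):
        self.properties = {"name": name, "tier": tier, "iops": iops}
        self.snapshots = []

    def snapshot(self, volume):
        """A method invoked through the management layer,
        e.g. a hardware-based snapshot."""
        self.snapshots.append(volume)
        return f"snap-of-{volume}"

def describe(array):
    """Serialize the object's properties as XML metadata, in the spirit
    of the CIM/XML interrogation described above."""
    root = ET.Element("storage-array")
    for key, value in array.properties.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

arr = StorageArray("prod-01", tier="high-performance", iops=50_000)
print(arr.snapshot("db-volume"))
# snap-of-db-volume
print(describe(arr))
```

Management software that can interrogate properties like these, and invoke methods like `snapshot`, is what turns a pile of hardware into something you can program against.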
The advent of programmable storage will enable a complete paradigm shift in how IT departments handle data resources. The hard work (which is policy definition and creation) needs to be done by you. So start defining your policies today. Begin by benchmarking your storage arrays to get the performance metrics that can be included in your policies for data placement. You will need to take a closer look at your current business processes, to make sure they tie in seamlessly with your data management policies.
The end result is a completely automated data center that can be monitored via a single console, and managed by application rather than specific components.
Happy New Year,
And now Happy 2014 too!
This article is published as part of the IDG Contributor Network.