Driving Storage Convergence

Steven Kleiman

Title: Senior vice president of engineering and chief technology officer

Company: Network Appliance Inc.

Claim to fame: As chief technologist at Sun Microsystems Inc., he helped design the popular Network File System Unix file sharing protocol.

Challenges: Convincing IT that, by adding a LUN semantic layer atop its block management architecture, Network Appliance storage pools can serve as the hub of a converged SAN/NAS storage network.

As senior vice president of engineering and chief technology officer at Network Appliance Inc. in Sunnyvale, Calif., Steven Kleiman is the visionary behind the vendor's storage technology agenda. Computerworld's Robert L. Mitchell talked with him about merging the worlds of storage-area networks (SAN) and network-attached storage (NAS).

Steven Kleiman of Network Appliance Inc.
What will be the most important storage technology trend in the next 12 months? SAN/NAS convergence is clearly what's happening. We have two products that essentially export block-level interfaces. One is SnapDrive, and the other is our DAFS [Direct Access File System] Database Accelerator. We'll continue our architecture of a filer head with Fibre Channel-based interconnects.

Our system has an underlying block management layer that does the RAID layout optimization, and there's a file semantic layer on top of that that does things like create a directory and whatnot. And on top of that are the file protocols. We added a LUN [logical unit number] semantic layer, and that creates LUNs of various sizes and it goes right on top of the underlying block management layer. It uses the same storage and storage pool. Our intent is to try to . . . let the SAN stuff share space with the NAS stuff and take advantage of the array bandwidth that's available.
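The layering Kleiman describes can be sketched as a toy model. This is illustrative code only, not NetApp's implementation: the class names (`BlockPool`, `FileLayer`, `LunLayer`) and the simple bump allocator are assumptions made up for the sketch. The point it shows is the one in the quote: a file-semantic layer (NAS) and a LUN-semantic layer (SAN) both sit directly on the same underlying block management layer, so they draw from one shared storage pool.

```python
# Toy sketch of a shared block pool with both a file-semantic layer
# (NAS side) and a LUN-semantic layer (SAN side) on top of it.

class BlockPool:
    """Underlying block management layer: hands out blocks from one pool."""
    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.next_free = 0

    def allocate(self, n):
        if self.next_free + n > self.total_blocks:
            raise RuntimeError("storage pool exhausted")
        start = self.next_free
        self.next_free += n
        return list(range(start, start + n))

class FileLayer:
    """File-semantic layer: maps file names (directories, etc.) to blocks."""
    def __init__(self, pool):
        self.pool, self.files = pool, {}

    def create_file(self, name, n_blocks):
        self.files[name] = self.pool.allocate(n_blocks)

class LunLayer:
    """LUN-semantic layer: exports block ranges as LUNs of various sizes."""
    def __init__(self, pool):
        self.pool, self.luns = pool, {}

    def create_lun(self, lun_id, n_blocks):
        self.luns[lun_id] = self.pool.allocate(n_blocks)

# Both layers share the same pool, so SAN LUNs and NAS files
# consume space from the same set of spindles.
pool = BlockPool(total_blocks=1000)
nas, san = FileLayer(pool), LunLayer(pool)
nas.create_file("/vol/home/report.doc", 10)
san.create_lun(0, 200)
print(pool.next_free)  # 210 blocks consumed from the one shared pool
```

The design choice being illustrated: because the LUN layer bypasses file semantics and talks straight to block management, SAN traffic gets the same RAID layout optimization and array bandwidth as NAS traffic, with no second pool to manage.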

What about dynamically scalable volumes? That's not easy to do with SANs and Fibre Channel arrays. Will that change in a converged world? There are whole steps that you don't do with our stuff that you have to do with the traditional large-block server approach. The storage management issues that remain with SANs are inherent to SANs.

This is one of the reasons why people like NAS. Some applications like the SAN protocols better, and we can deal with that. Personally, I think the NAS protocols, when used, lead to a more efficient use of storage.

Are you saying that host servers should read and write files instead of doing SCSI block transfers? They already do. The question is how this shakes out over time as people get higher- and higher-speed networks with low overhead file access protocols like NAS. We shall see.

Management of NAS boxes has traditionally been complicated by the fact that every NAS appliance must have its own filer head and management interface. How will this change in the future? We can bring the overhead of managing multiple filers down to a fairly low level . . . but the management of the storage [devices] themselves does not go away no matter what you do.

Today if you actually have to do some management, you do it on a head-by-head basis. Going forward, that will blend. You will see less of a head. Today that's not the case.

How exactly will that blend? In the data center you can solve some of the multiple filer problems with high-speed interconnect technology and try to bring a more scalable filer, if you will. The interconnect for our cluster in our new model is InfiniBand, and that's a clear direction in terms of using these new, high-speed commodity fabrics to build more scalable systems.

Fibre Channel can transport data, but control and management information must be routed over an IP LAN, essentially requiring parallel networks to exist. When will this change? We've been pushing toward in-band management as much as we can, and the Fibre Channel community doesn't appear to be going there. My prediction is that it will stay separate for now.

With iSCSI you can route storage blocks over IP instead of Fibre Channel. But is the technology ready? If you look at our DAFS product, it's based on the [Emulex Corp. GN/9000SI] RDMA [Remote Direct Memory Access] over TCP/IP card that uses Gigabit Ethernet. We can get reasonably good performance with [it]. I think we'll be seeing some reasonable TCP/IP off-load cards that are quite competitive with [Fibre Channel host bus adapter] technology.

We're members of the RDMA Consortium, and the goal is to come up with a standard RDMA over TCP protocol in time to make the first generation of 10 gigabit TCP off-load engines. If this all comes to fruition, you will have one high-speed network that does traditional NAS and other communication protocols in an off-loaded way with iSCSI and DAFS all in one card.

Filers are expanding to tens of terabytes, but how do you back them up? A NetApp filer can support eight backup streams to tape. Even with the best tape technology, a 20TB filer would take 40 hours to complete a full backup. How do you get around that? You've come to the same conclusion I came to several years ago, that this is just hosed. The data is exploding way faster than tape is getting faster or bigger. We're addressing that by putting another level in the storage hierarchy with NearStore [disk-to-disk backup]. You should be looking over the next few months and further for a convergence of our filer technology and caching technology.
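The 40-hour figure in the question is easy to sanity-check. A quick back-of-the-envelope calculation (assuming decimal terabytes and megabytes, which the article does not specify) shows what each of the eight tape streams would have to sustain:

```python
# Sanity check on the quoted backup window: how fast must each of
# 8 parallel tape streams run to back up 20TB in 40 hours?
capacity_bytes = 20 * 10**12            # 20TB filer (decimal TB assumed)
streams = 8                             # parallel backup streams to tape
hours = 40                              # quoted full-backup window
per_stream_mb_s = capacity_bytes / (hours * 3600) / streams / 10**6
print(round(per_stream_mb_s, 1))        # ~17.4 MB/sec per tape drive
```

That ~17 MB/sec per drive is roughly the native speed of circa-2002 tape technology, which is why the window cannot shrink much without either more streams or the disk-to-disk tier Kleiman describes.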

The goal is to get out of the backup business on a daily basis and make restore transparent, meaning there isn't a long downtime while you go ahead and restore something. I think tape becomes more of an archival mechanism where you do a full backup once a month, for legal purposes perhaps.

Reference data servers like EMC Corp.'s Centera create a unique ID for unchanging files, creating an abstraction layer between stored objects and the applications attempting to access them. Using the Centera application programming interface, an application can use this object name and no longer must track the path to the stored file. Will NetApp take a similar approach? I like to think that we're already there. Part of what NearStore is about is giving you low-cost ways of storing archival data.

Using a name based on the content is pretty easy to do, but I don't see a need for it. Most of the archiving mechanisms are done through applications like Documentum and FileNet, and truthfully they're the ones who should say what the underlying storage requirements should be. If there are specific enhancements that are needed for data integrity, which we don't believe there are right now, we can add those pretty easily. But I don't see it yet.
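The content-based naming Kleiman calls "pretty easy to do" is the Centera-style idea from the question: derive the object's ID from a hash of its bytes, so the name is independent of any file-system path and identical content maps to the same ID. A minimal sketch (the in-memory `store` dict and SHA-256 are assumptions for illustration; Centera itself used a different hash and API):

```python
# Minimal content-addressable storage sketch: the object ID is a hash
# of the content, decoupling stored objects from file-system paths.
import hashlib

store = {}  # stand-in for the archival back end

def put(data: bytes) -> str:
    object_id = hashlib.sha256(data).hexdigest()  # name derived from content
    store[object_id] = data                       # idempotent for duplicates
    return object_id

def get(object_id: str) -> bytes:
    return store[object_id]

oid = put(b"unchanging reference document")
assert get(oid) == b"unchanging reference document"
# Storing the same bytes again yields the same ID: free deduplication.
assert put(b"unchanging reference document") == oid
```

The trade-off implicit in his answer: this only works for unchanging data (any edit changes the ID), which is why he sees it as an application-level concern for archiving packages rather than a filer feature.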

An object-based file system is certainly something Microsoft is working on embedding in a future version of Windows. The trouble is, until that file-system interface is standardized and agreed upon, there's no point in protocolizing it. It's not embedded in every device, every host or every application server, and the applications that do this stuff today seem perfectly happy with the semantics they've got.


Copyright © 2002 IDG Communications, Inc.
