It has been a decade since the first version of the Network Data Management Protocol was launched as an effort to solve the problems posed by the backup and recovery of network file servers. Initial work on the standard was spearheaded by Intelliguard Software (subsequently rolled into Legato Software and EMC Corp.), which produced storage management software, and Network Appliance Inc., which manufactures network file servers.
The standard was developed to address the fact that network file servers are not able to use the storage device drivers designed for general-purpose computers. They are specialized appliances that connect to a network and are optimized to perform a single set of tasks. Their files are usually mounted by general-purpose computers through protocols such as the Unix/Linux Network File System and Microsoft Windows Common Internet File System.
Without NDMP, there were two choices for backing up network file servers. One was to mount their file systems onto the file system of a computer across the network and do the backup there. The downside was that backup and restore required network and server bandwidth. Moreover, the added complexity made it difficult to use optimized aspects of the network file server, such as Network Appliance's Snapshot capability.
The other option was to write driver software for each type of network file server and locally attached storage system (tape drives, jukeboxes, CD-ROM writers). That required vendors (makers of network file servers and storage systems, backup control software houses, or both) to produce multiple driver variants.
The advantage of NDMP is that it establishes a single set of interfaces between the three components involved in a backup or restore operation -- the software controlling the backup or restore, the source medium and the destination medium. When all the components are NDMP-compliant, the manufacturer of each can concentrate on maximizing the efficiency of its side of the interface.
By 1999, the time for backing up an Oracle database residing on one of Network Appliance's network file servers had been reduced from hours to minutes. Instead of mounting the network file server's files to the computer acting as an Oracle server, the backup was done locally on the network file server and used Network Appliance's Snapshot files, which allow for live backup of a consistent disk image.
The paradigm for NDMP is a client/server architecture in which data producers and consumers are thought of as servers or service providers, and the backup control software, which starts, stops and monitors backup and recovery, is thought of as a client. There is one client per NDMP session. There can be multiple servers. In NDMP documentation, clients are also sometimes called data management applications, and servers or service providers are called data service providers (DSP).
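The roles just described can be sketched as a small data model. This is a minimal illustration in Python, not part of any NDMP library: the class names `DataManagementApplication`, `DataServiceProvider` and `Session`, and the example host names, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataServiceProvider:
    """A server in NDMP terms: something that produces or consumes data."""
    name: str


@dataclass
class DataManagementApplication:
    """The client: backup control software that starts, stops and monitors."""
    name: str


@dataclass
class Session:
    """One session: exactly one client, and one or more servers."""
    client: DataManagementApplication
    servers: list = field(default_factory=list)

    def add_server(self, dsp: DataServiceProvider) -> None:
        self.servers.append(dsp)


# One client per session; several servers may take part.
session = Session(client=DataManagementApplication("backup-console"))
session.add_server(DataServiceProvider("file-server"))
session.add_server(DataServiceProvider("tape-library"))
```

The session holds a single client field but a list of servers, which mirrors the one-client, many-servers rule.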
A DSP such as a network file server produces a data stream when it provides data to a storage system for backup. It consumes data when a storage system provides it with data for a restore.
Data Replication
This agnostic view of whether a data service is a producer or consumer lends itself to data replication. One storage system can provide a data stream that is consumed by an identical storage system, and the data is copied from one system to another.
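The symmetry between producing and consuming can be shown with a short Python sketch. It is only an illustration of the idea, with in-memory lists standing in for real storage; the function names `produce` and `consume` are hypothetical and do not come from the NDMP specification.

```python
from typing import Iterable, Iterator


def produce(blocks: list) -> Iterator[bytes]:
    """Act as a producer: emit stored blocks as a data stream."""
    for block in blocks:
        yield block


def consume(stream: Iterable[bytes]) -> list:
    """Act as a consumer: store whatever arrives on the stream."""
    return [block for block in stream]


# Backup: the file server produces, the storage system consumes.
file_server = [b"alpha", b"beta", b"gamma"]
tape = consume(produce(file_server))

# Restore: the same storage system now produces, the file server consumes.
restored = consume(produce(tape))

# Replication: one storage system feeds an identical one.
replica = consume(produce(tape))
```

Backup, restore and replication all use the same two functions; only the direction of the stream changes.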
In the original versions of NDMP, only one data stream was allowed in the transaction between producers and consumers. In Version 5, which is in the proposal stage this year, that requirement has been loosened with the invention of the Translate Service, which sits between producers and consumers and can multiplex data streams. Although it may open up the possibility for all kinds of intermediate translation, its immediate goal is greater efficiency: the faster side of what had been a single producer/consumer pair can take in data from several sources at once.
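To make the idea of multiplexing concrete, here is a minimal Python sketch of several producers feeding one consumer. How the Translate Service actually combines streams is not described here, so the round-robin interleaving and the source tags are assumptions, and the names `multiplex` and `demultiplex` are hypothetical.

```python
from itertools import zip_longest


def multiplex(sources: dict):
    """Interleave chunks from several producers, tagging each with its source."""
    names = list(sources)
    iterators = [iter(sources[name]) for name in names]
    sentinel = object()  # marks a producer that has run out of data
    for row in zip_longest(*iterators, fillvalue=sentinel):
        for name, chunk in zip(names, row):
            if chunk is not sentinel:
                yield (name, chunk)


def demultiplex(stream) -> dict:
    """Split a tagged stream back into one list of chunks per source."""
    out = {}
    for name, chunk in stream:
        out.setdefault(name, []).append(chunk)
    return out


# Two slow producers share one fast consumer (assumed example data).
sources = {"server_a": [b"a1", b"a2", b"a3"], "server_b": [b"b1"]}
combined = list(multiplex(sources))
```

Because every chunk carries its source tag, the combined stream can be separated again on restore.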
In an NDMP session, there is always one TCP/IP connection between each service and the software that centrally manages the network's backup and recovery operations, which is the data management application. NDMP is geared toward facilitating centralized control of backup and recovery operations. The client initiates contact with services via a well-known TCP/IP port and then follows up with a standard command-and-response dialogue, which is effectively a state machine, with the state maintained on the client. The data services are moved through states with names such as "Idle," "Listen," "Active" and "Halted."
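The command-and-response dialogue can be pictured as a small state machine. The Python sketch below uses the four state names quoted above; the command names (`listen`, `start`, `halt`, `stop`) and the transition table are hypothetical, chosen only to illustrate the pattern, and are not taken from the NDMP specification.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    LISTEN = "Listen"
    ACTIVE = "Active"
    HALTED = "Halted"


# Hypothetical commands and transitions, for illustration only.
TRANSITIONS = {
    (State.IDLE, "listen"): State.LISTEN,
    (State.LISTEN, "start"): State.ACTIVE,
    (State.ACTIVE, "halt"): State.HALTED,
    (State.HALTED, "stop"): State.IDLE,
}


class ServiceTracker:
    """Tracks the state of one data service as commands are issued."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def send(self, command: str) -> State:
        key = (self.state, command)
        if key not in TRANSITIONS:
            raise ValueError(f"{command!r} is not valid in state {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state


tracker = ServiceTracker()
history = [tracker.send(cmd).value for cmd in ("listen", "start", "halt", "stop")]
```

A command that is not listed for the current state raises an error instead of silently changing state, which keeps the dialogue predictable.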
Although the basic paradigm for all communication, both control and data, is via TCP/IP, the door is left open for services to realize local efficiencies, such as when a backup device is attached locally or if a system happens to be on a high-speed storage-area network. Up through Version 4, there were several standard network configurations for NDMP backup and restore sessions. In one, the client sits on a server of its own and commands a network file server to back up to a locally attached storage device. In another, the client again sits on a server of its own and commands a file server to back up, but this time to a storage device located elsewhere on the network. The standard configurations for restore are identical, except the data flow goes in the other direction.
Version 5 is concerned with Internet issues, such as security authorization and networks that span the Web (which is one of the reasons the NDMP working group has migrated from the Storage Networking Industry Association to the Internet Engineering Task Force).
[Figure: Sample Architectures]
Matlis is a freelance writer in Newton, Mass.