Review: MongoDB 3.0 reaches for the enterprise
MongoDB zeroes in on operations with pluggable storage engines and revamped management tools
Already one of the finest JSON-document databases out there, MongoDB has unleashed an impressive set of improvements with its 3.0 release. Originally, the release was given the numeric designation of 2.8, but after significant feedback from the user base -- whose members claimed that the extent of the new features warranted an integer-level version number change -- the MongoDB engineers rechristened the release MongoDB 3.0.
Though the new features are numerous, perhaps the most significant is the new pluggable storage engine API. Not only does this interface permit third parties to create new storage engines, tuned to specific applications, it allows MongoDB to provide its own storage engines, which are available as part of the release. Both MongoDB engines bring more granular locking and greater concurrency to the database -- in a nutshell, more speed.
Other improvements, ranging from easier query optimization to richer logging, will warm the hearts of DBAs and operators. At the highest level, the MongoDB Management Service, now dubbed Cloud Manager, has been enhanced to serve as a complete management system, capable of controlling arbitrarily large MongoDB clusters from a single console.
Revving the engines
Prior versions of MongoDB offered a single storage engine, MMAP, so named because it employed memory-mapped files. That is, MMAP left all the work of disk I/O, seeks, block caching, and so on to the operating system. MongoDB 3.0 offers both an improved version of MMAP (MMAPv1) and the WiredTiger storage engine.
MMAPv1, a drop-in replacement for MMAP, is backward compatible with MMAP, so existing MongoDB deployments can upgrade to MMAPv1 with no data conversions required. MMAPv1’s biggest improvement over its ancestor is finer locking granularity. MMAP locked at the database level; within a given MongoDB database, clients could write to only one collection at a time. MMAPv1 locks at the collection level, so clients can perform writes on multiple collections within a database simultaneously. The improved concurrency is primarily evident in speedier write operations, as writes are the main requestors of locks. But reads are quicker too, as they spend less time blocked behind writes.
MMAPv1 also has improved allocation mechanisms. Because MongoDB is document-based, each document must be stored contiguously. Thus, if a document is updated -- say, a new field is added or the content of an existing field is extended -- the entire document must be rewritten to accommodate the additional space required. Needless to say, document rewriting is an expensive operation on persistent storage, particularly if the document has to be moved, which then requires updating any indexes to the document.
Prior to version 3.0, MongoDB used a padding factor, which specified additional space at allocation time to minimize the need for rewriting. This padding factor was inferred based on prior allocation history and document growth.
Version 3.0 employs "power of two"-sized allocations. Each document is stored in a record whose size is rounded up to the nearest power of two, so padding for growth is always available to a document. Whenever a document resize overflows the current allocated record, the new record allocation is the next higher power of two. This reduces the number of times a document must be rewritten, though admittedly at the cost of extra, unused space per document. In addition, because the record size is a power of two, disk fragmentation is reduced (freed space can be easily reused).
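The allocation rule is easy to sketch. Here is a minimal Python illustration of rounding a document's size up to the next power of two; the 32-byte floor is an illustrative assumption, not necessarily MongoDB's exact minimum record size:

```python
def record_size(doc_size):
    """Round a document size up to the next power of two,
    mimicking MongoDB 3.0's record allocation strategy.
    The 32-byte floor is an illustrative minimum."""
    size = 32
    while size < doc_size:
        size *= 2
    return size

# A 300-byte document is placed in a 512-byte record, leaving
# 212 bytes of padding; it can grow in place until it exceeds
# 512 bytes, at which point it moves to a 1,024-byte record.
print(record_size(300))   # → 512
print(record_size(600))   # → 1024
```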
Note: If you know that your database application will not increase document size, you can disable the "power of two" allocation strategy on individual collections.
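As a sketch, disabling it might look like this in the mongo shell (the collection name is made up, and a running mongod is required):

```javascript
// Hypothetical mongo shell session; "events" is an assumed collection
// name. The noPadding flag switches off power-of-two allocation for
// collections whose documents never grow after insertion.
db.runCommand({ collMod: "events", noPadding: true })
```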
Inside WiredTiger
While MMAPv1 is the default storage engine for MongoDB 3.0, the new WiredTiger storage engine offers significant benefits in a number of areas. WiredTiger was developed by the company of the same name that MongoDB acquired in December 2014. Its engineers architected the well-regarded (and widely used) Berkeley DB database.
Much of WiredTiger’s performance derives from its carefully engineered internals. For example, WiredTiger uses “hazard pointers,” a lock-free mechanism for controlling multithreaded access to shared objects. In addition, it leverages log-structured merge trees for fast updates, and augments tree access with Bloom filters, which reduce the likelihood of index misses when searching for a key.
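To make the Bloom filter idea concrete, here is a minimal, generic sketch in Python -- not WiredTiger's implementation, just the data structure it relies on:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: answers 'definitely absent' or
    'probably present', letting a storage engine skip disk
    lookups for keys that were never written."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into one integer

    def _positions(self, key):
        # Derive several bit positions from independently seeded hashes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means the key was certainly never added; True means it
        # probably was (false positives are possible, misses are not).
        return all(self.bits >> pos & 1 for pos in self._positions(key))

bf = BloomFilter()
bf.add("user:42")
print(bf.might_contain("user:42"))  # → True
```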
Whereas MMAPv1 locks at the collection level (already a significant improvement over MMAP’s locking), WiredTiger locks at the document level. Thus, MMAPv1’s concurrency will generally be better than MMAP’s, but WiredTiger’s concurrency will be better still.
WiredTiger also does a fine job of compressing data. There are three options available: no compression, Snappy (the default), and Zlib. Both Snappy and Zlib compression are provided by third-party libraries.
(You might recall that MongoDB stores its data as BSON -- binary JSON -- documents. BSON is more or less JSON with a few added data types. BSON does not compress its data. Also, because MongoDB documents are self-describing, field names are carried in every document, which means there can be a great deal of duplication among documents in a collection. Therefore, applying compression has the potential to significantly reduce the storage footprint.)
The Snappy compression engine was developed at Google. It aims for speed over compression ratio. MongoDB engineers estimate that, for about 5 percent of CPU usage, you get close to 75 percent storage savings. The Zlib compression engine provides better compression density, but at a higher CPU cost.
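Snappy itself is not in the Python standard library, but zlib is, and its compression levels illustrate the same speed-versus-density tradeoff -- along with how well repeated BSON field names compress. A small sketch (the document contents are made up):

```python
import json
import zlib

# Ten documents that all repeat the same field names, as
# self-describing BSON documents do.
docs = [
    {"first_name": "User", "last_name": str(i),
     "email_address": "user%d@example.com" % i}
    for i in range(10)
]
raw = json.dumps(docs).encode()

fast = zlib.compress(raw, 1)   # speed-oriented, like the Snappy tradeoff
dense = zlib.compress(raw, 9)  # density-oriented, like the Zlib option

# The duplicated field names largely compress away, so both
# outputs are much smaller than the raw encoding.
print(len(raw), len(fast), len(dense))
```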
Note that WiredTiger applies data compression at the collection level, so you can target compression at those collections of documents that require it most. WiredTiger compresses indexes separately (if index compression is enabled) using prefix compression, a form of de-duplication. Compressed collection data is decompressed when read from disk into RAM, while prefix-compressed indexes remain usable in their compressed form in memory.
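Prefix compression exploits the fact that index keys are stored in sorted order, so neighboring keys tend to share leading bytes. A rough, generic sketch of the idea in Python (not WiredTiger's actual on-disk format):

```python
def prefix_compress(sorted_keys):
    """Encode each key as (shared_prefix_len, suffix) relative to
    the previous key -- the core idea of prefix compression."""
    encoded = []
    prev = ""
    for key in sorted_keys:
        # Count the leading characters shared with the previous key.
        n = 0
        while n < min(len(prev), len(key)) and prev[n] == key[n]:
            n += 1
        encoded.append((n, key[n:]))  # store only the differing suffix
        prev = key
    return encoded

keys = ["app_name", "app_version", "apply", "user_id"]
print(prefix_compress(keys))
# → [(0, 'app_name'), (4, 'version'), (3, 'ly'), (0, 'user_id')]
```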
Mixing and matching
As mentioned earlier, while the pluggable storage engine API allows MongoDB to incorporate MMAPv1 and WiredTiger, developers can create their own storage engines to add (or enhance) specific storage features for targeted applications. In addition, different storage engines can be used in the same MongoDB deployment -- even within the same replica set. Replication is transparent to the storage engine.
In a replica set, the primary's documents can be stored by one storage engine and the replicated copies by another. One could envision the primary members of a replica set running an in-memory storage engine (an experimental version of which the MongoDB engineers are currently evaluating), while the secondary members use a persistent storage engine such as WiredTiger. Access to data within the replica set will be rapid, because requests will be satisfied by the in-memory storage engine on the primary members, while the data is safely persisted on the secondaries for crash protection.
Note that the ability to mix storage engines in a deployment also provides a mechanism for migrating a collection from one storage engine to another. Although an MMAPv1 database is compatible with an existing MMAP database, the same is not true of a WiredTiger database. By mixing storage engines in a replica set, you can move MMAP data to WiredTiger in a kind of rolling upgrade, with no downtime experienced by database clients.
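Such a rolling migration might look roughly like the following (the replica set name and data path are hypothetical, and each member must be resynced on an empty data directory):

```shell
# 1. Shut down one secondary, then restart it on an empty data
#    directory with the WiredTiger engine. It performs an initial
#    sync from the replica set, rebuilding its data under WiredTiger.
mongod --replSet rs0 --dbpath /data/wt --storageEngine wiredTiger

# 2. Repeat for each remaining secondary. Finally, from a mongo
#    shell connected to the primary, ask it to step down, then
#    migrate the former primary the same way:
#    rs.stepDown()
```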
Managing MongoDB
While not technically a part of the 3.0 release, MongoDB Management Service (MMS) has undergone so many enhancements that it warrants discussion. Now called the Cloud Manager, this tool has grown from a simple monitoring console into a full-fledged monitoring and management service.
With Cloud Manager, an administrator can not only track live cluster statistics (which MMS could do), but also provision and deploy MongoDB clusters, perform upgrades, and schedule backups. The system is highly automated, so actions from a single Cloud Manager console can control MongoDB clusters of arbitrary size. The Cloud Manager can automatically deploy various automation agents to cluster members, and the agents will do all the legwork in response to Cloud Manager commands.
An enterprise version of Cloud Manager -- called Ops Manager -- is available for installations that want or require an on-site monitoring and management dashboard. Ops Manager executes as a local application (rather than in the cloud, as Cloud Manager does). Cloud Manager is available free as a 30-day trial, after which time it scales back to provide only monitoring features. Ops Manager is available only with the Enterprise Edition of MongoDB.
Other improvements
MongoDB is accompanied by a host of command-line tools. These include mongoimport to import data from external files in a variety of formats, mongoexport to export data from a MongoDB database into JSON or CSV files, mongodump to back up a database by exporting its contents in binary form into external files, and others.
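Typical invocations look like this (the database, collection, and file names are made up, and each command needs a reachable mongod):

```shell
mongoexport --db shop --collection orders --out orders.json   # JSON export
mongoimport --db shop --collection orders --file orders.json  # re-import
mongodump   --db shop --out /backup/shop                      # binary backup
```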
These tools have been completely rewritten for MongoDB 3.0. In their earlier incarnations, they were crafted in C++. Their new versions are written in Go. According to MongoDB engineers, rewriting them in Go has made them easier to maintain and has reduced the size of the executables. In addition, the MongoDB engineers are able to leverage Go’s multithreaded architecture to enhance the tools’ performance.
Previous editions of MongoDB allowed no more than 12 nodes in a replica set; MongoDB 3.0 permits up to 50. Couple this with the ability to configure read requests so that they are served by the nodes closest (based on ping times) to the requesting clients, and MongoDB improves both responsiveness and data resiliency. With more nodes in a replica set, it is more likely that a nearby node can service a request, and the loss of a single node has a smaller impact on client access to data.
MongoDB 3.0 also improves the query system. MongoDB’s explain() method returns a document that describes the plan used to satisfy a query. In the past, you had to run the query first, then execute explain() to see the winning query plan. With MongoDB 3.0, you can issue explain() prior to the execution of the query, allowing DBAs to fine-tune a query before carrying it out. (This is particularly handy for long-running queries.) The returned query plan document shows not only the winning query plan, but also the alternative plans that the query optimizer considered.
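A sketch of the new form in the mongo shell (the collection and field names are assumed):

```javascript
// In 3.0, explain() wraps the operation before it runs. The default
// "queryPlanner" mode reports the winning plan and the rejected
// alternatives without executing the query itself.
db.orders.explain().find({ status: "shipped", total: { $gt: 100 } })
```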
You’ll find plenty of other enhancements in MongoDB 3.0. Improvements to geospatial indices support queries over larger areas (documentation claims that queries can now be made on regions that exceed 50 percent of the earth's surface). The increased granularity of the logging system lets developers tune the system at runtime so that specific components emit more verbose logs, making it easier to track down problems. Further, DBAs can now configure audit logs to capture any operation in the database. In the past, only administrative actions could be logged.
I could go on, but the best thing to do at this point would be to head over to the MongoDB website, download a copy of MongoDB 3.0, and experience the enhancements for yourself. You’ll find a scale-out, NoSQL database that is beginning to offer as much flexibility to operators as it does to developers.
This story, "Review: MongoDB 3.0 reaches for the enterprise" was originally published by InfoWorld.
Copyright © 2015 IDG Communications, Inc.