First look: Amazon RDS gives you a MySQL server in the cloud

Amazon's Relational Database Service (RDS) creates a MySQL database server in the cloud. The notion of "database as a service" is already a reality, thanks to products like FathomDB. Amazon itself provides EC2 instances running MySQL, and Amazon's SimpleDB service offers capabilities similar -- but not identical -- to Amazon RDS.

So why RDS? In particular, why choose RDS over SimpleDB?

SimpleDB's very name gives the best reason. SimpleDB is intended to be used as a simple database. SimpleDB stores tuples -- attributes and values -- arranged as rows identified by an ID field. It's excellent if you need basic read, write, and query capabilities. It is not, however, a relational database.

Very well, you say, then what of an instance of EC2 running MySQL? Doesn't that provide the same functionality as RDS? True, but RDS pares the components to their MySQL essentials. Unlike an EC2 instance, RDS requires no operating system configuration or management. Nor do you have to work out the details of connecting your EC2 instance to EBS (Elastic Block Store) or worry about backing up the EBS volume holding your database.

Put simply, if a MySQL database is all you want, RDS is just that -- nothing more, nothing less. It's that last part -- nothing less -- that's the real power of RDS.

Instant DB Instance

From a user's (and developer's) perspective, Amazon RDS is no more than a remote MySQL database. Amazon refers to a specific instantiation of an RDS-based MySQL server as a DB Instance. Amazon provides a Web-service based API for creating and managing DB Instances; the rest can be handled by standard MySQL communication protocols.

When a DB Instance is created, you specify the attributes that govern its behavior and capacity. For example, a DB Instance's Class determines the server's available memory and processing power. (Amazon specifies processing power using a metric called an EC2 Compute Unit, or ECU, roughly equivalent to a 1.0GHz 2007 Xeon processor.) The list of instance classes reads like a fast food menu. Start with a Small DB Instance at 1.7GB of memory and one ECU, work your way up to a Large instance, then an Extra Large instance, a Double Extra Large instance, and top out at the gut-busting Quadruple Extra Large instance that boasts 68GB of memory and 26 ECUs.

Another attribute defined at DB Instance creation time is the available storage, which can range from 5GB to 1,024GB. While Amazon RDS tops out at 1TB databases, that limitation applies to a single DB Instance. Nothing stops you from partitioning your data into multiple DB Instances; effectively, then, the upper limit -- other than your pocketbook -- is 1TB per table.
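To make the partitioning arithmetic concrete, here is a minimal sketch that works out how many DB Instances a data set would need under the 5GB and 1,024GB limits quoted above. The function name `plan_instances` and the even-split policy are my own assumptions for illustration; RDS does not partition data for you, and your application would still have to route queries to the right instance.

```python
import math

# Per-instance storage limits quoted in the article, in GB.
MIN_STORAGE_GB = 5
MAX_STORAGE_GB = 1024


def plan_instances(total_gb):
    """Split total_gb of data across as few DB Instances as possible.

    Returns a list of per-instance storage allocations in GB. The even
    split is an illustrative policy, not something RDS does itself.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    # Fewest instances that can hold the data at 1,024GB each.
    count = math.ceil(total_gb / MAX_STORAGE_GB)
    # Spread the data evenly, rounding up to whole gigabytes.
    per_instance = math.ceil(total_gb / count)
    # Every allocation must still respect the 5GB minimum.
    per_instance = max(per_instance, MIN_STORAGE_GB)
    return [per_instance] * count
```

For example, `plan_instances(3000)` yields three allocations of 1,000GB each, while anything under 5GB is rounded up to a single 5GB instance.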

When you create a DB Instance, you can also define its backup window. This is a time interval within a 24-hour period during which (if specified) Amazon will automatically back up your database. Associated with the backup window parameter is the retention period parameter, which specifies how many days Amazon retains the backup. (Currently, backups can be retained for up to eight days.)

Finally, if you'd rather not depend on daily, automated backups, you can request a database snapshot at any time. And since each snapshot is associated with a unique identifier, you can create a series of snapshots and restore the database to a specific past state. Of course, storage of snapshot data is not a free service.

Working with RDS

To use Amazon RDS, you need two things: the command-line tools and a MySQL-compatible client application. The former is provided by Amazon as a set of Java applications, downloadable from the Amazon Web Services site. They handle the management of DB Instances -- creating, adjusting the parameters, deleting, and so on. The latter can be any application that communicates with a MySQL server. Of course, you also need an AWS Access Key ID and associated AWS Secret Access Key, both of which you receive when you sign up for Amazon Web Services.

Once you've downloaded the command-line tools, you set a pair of environment variables. One points to a file containing your AWS access key ID and secret access key. The other holds the command-line tools' path. That done, you can create a small instance (1 ECU and 20GB of storage) using a command like:

rds-create-db-instance --db-instance-identifier rginstance --allocated-storage 20 --db-instance-class db.m1.small --engine MySQL5.1 --master-username rgrehan --master-user-password mypassword --db-name ADBTest --headers

In this case, rds-create-db-instance is one of the command-line tools.

This produces a DB Instance identified as rginstance, whose administrator's login name is rgrehan and password is mypassword. In addition, a database named ADBTest will be created within the instance and have a maximum storage allotment of 20GB.
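If you create instances from a script rather than by hand, it can help to assemble the command from named parameters so that a mistyped option is harder to slip in. The following is only a sketch: `build_create_command` is a hypothetical helper of my own, it uses only the options shown in the command earlier, and it builds the command line without running it.

```python
import shlex


def build_create_command(identifier, storage_gb, instance_class,
                         engine, username, password, db_name):
    """Return the argument list for an rds-create-db-instance call.

    Hypothetical helper: it only assembles the options shown in the
    article and does not invoke the tool.
    """
    # Storage range quoted in the article: 5GB to 1,024GB per instance.
    if not 5 <= storage_gb <= 1024:
        raise ValueError("allocated storage must be between 5 and 1024 GB")
    return [
        "rds-create-db-instance",
        "--db-instance-identifier", identifier,
        "--allocated-storage", str(storage_gb),
        "--db-instance-class", instance_class,
        "--engine", engine,
        "--master-username", username,
        "--master-user-password", password,
        "--db-name", db_name,
        "--headers",
    ]


# Same values as the example instance described in the article.
command = build_create_command("rginstance", 20, "db.m1.small", "MySQL5.1",
                               "rgrehan", "mypassword", "ADBTest")
command_line = shlex.join(command)  # shell-safe, single-line form
```

The list form can be handed to `subprocess.run` on a machine where Amazon's tools are installed; `command_line` is there for logging or pasting into a terminal.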

To connect a client application to the new database, you need its host path. You can get it with the help of the rds-describe-db-instances command, which returns information for DB Instances associated with your account, including each instance's path. If you don't want information for all your account's DB Instances, you can issue rds-describe-db-instances with a specific instance as an argument.

Finally, you have to grant access to the DB Instance to client applications. Do that with the rds-authorize-db-security-group-ingress command, passing it the CIDR-format range of IP addresses that you want to permit into your database. If you have security groups defined for Amazon EC2 instances, you can use those in place of the IP address ranges. RDS will only permit connections from clients in the security group.
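If CIDR notation is unfamiliar, the sketch below shows what a range actually covers, using Python's standard `ipaddress` module. It is a local illustration only, not how RDS evaluates its security groups; the function name `is_client_allowed` and the addresses (taken from the reserved documentation ranges) are assumptions of mine.

```python
import ipaddress


def is_client_allowed(client_ip, allowed_cidrs):
    """Return True if client_ip falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr)
               for cidr in allowed_cidrs)


# Hypothetical ranges: one /24 block (256 addresses) and one single host.
allowed = ["203.0.113.0/24", "198.51.100.7/32"]

office_ok = is_client_allowed("203.0.113.25", allowed)   # inside the /24
single_ok = is_client_allowed("198.51.100.7", allowed)   # the /32 host itself
outside = is_client_allowed("198.51.100.8", allowed)     # matches neither
```

A check like this is a handy way to confirm that the range you are about to authorize really includes your client machines, and nothing wider than you intended.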

From that point on, working with RDS is like working with any MySQL server. Any application or tool that can talk to MySQL can talk to your RDS DB Instance. You can even use the standard MySQL command-line monitor tool to create users and tables, issue SQL commands, and so on. The only restriction is that RDS disallows SUPER privileges, though RDS provides a specialized form of the MySQL "kill" command to fill the hole created by SUPER's absence. (See the RDS documentation for details.)
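As a small illustration of that point, connecting with the stock `mysql` command-line client needs nothing RDS-specific beyond the instance's host path. The sketch below assembles the client invocation; the host name is a made-up placeholder (use the path reported for your own instance), port 3306 is the MySQL default and assumed here, and `mysql_client_args` is a hypothetical helper.

```python
def mysql_client_args(host, user, database, port=3306):
    """Return the argument list for the standard mysql client.

    A bare -p makes the client prompt for the password, which keeps it
    out of the shell history.
    """
    return ["mysql", "-h", host, "-P", str(port), "-u", user, "-p", database]


# Placeholder host; the user and database match the example instance.
args = mysql_client_args("rginstance.example.invalid", "rgrehan", "ADBTest")
```

Joined with spaces, `args` reads `mysql -h rginstance.example.invalid -P 3306 -u rgrehan -p ADBTest`, the same form you would use for any other remote MySQL server.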

Once the Amazon RDS DB Instance is created and access is authorized, it can be managed like any remote MySQL database, for example with the MySQL Administrator tool.

If you prefer manipulating DB Instances programmatically instead of by the command-line tools, Amazon has posted a collection of libraries for a variety of programming languages. At the time of this writing, libraries were available for Java, C#, PHP, VB.Net, and Perl. Because RDS's management functions are exposed as a set of Web services, the libraries are basically wrappers around Web service-based remote procedure calls.

I tested the PHP library, which employs PHP's object-oriented features to abstract RDS access into an Amazon_RDS_Client class. All you have to do to enable the library is modify a configuration file so that database objects can find your AWS access key ID and AWS secret access key. Once that's done, there are at least two dozen example programs that you can run in script fashion (launched from the command line) to explore RDS's API calls.

Pay as you go

A fundamental principle of Amazon Web Services has always been that you pay for what you use. Costs for RDS are therefore wholly a function of your MySQL application's usage profile.

There are several DB Instance classes, and each has its own price-per-hour figure. A Small DB Instance is 11 cents per hour. A Quadruple Extra Large DB Instance is $3.10 per hour. The rest are arrayed in between. Note that you are charged for as long as your DB Instance lives -- even if it is not responding to any MySQL commands. You have to terminate your database to stop the meter. But you don't have to lose your data. You can request that a snapshot be created at termination and restore from that snapshot at a later time. There's a cost for Amazon to hold the snapshot, of course: 15 cents per gigabyte per month.

You must also calculate "warehousing" fees for your data -- that's 10 cents for each gigabyte per month. And I/O requests cost 10 cents per million I/Os. There are also data transfer fees, which depend on direction. All data transfers in are 10 cents per gigabyte; data transfers out are priced on a staggered scale (see the RDS Web site for details). However, if you are transferring data into or out of your RDS instance from another AWS utility -- say, an EC2 instance -- and both are in the same region, then the transfer is free.

Backups won't cost you anything, as long as the backups use no more space than you originally allotted for your database. Beyond that, it's 15 cents per gigabyte per month (the same as for snapshot retention).

Anyone wanting to employ Amazon's RDS had best put together a spreadsheet macro or two to help with the cost estimation.
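In place of a spreadsheet, a rough estimate can also be sketched in a few lines of code. The version below uses only the rates quoted in this article, which will certainly change over time. The class labels `"small"` and `"quadruple-extra-large"` are my own shorthand, snapshot storage is left out, and because outbound transfer is priced on a sliding scale not reproduced here, the caller supplies that figure directly. Pass whole numbers or `Decimal` values rather than floats.

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars (illustrative, not current).
HOURLY_RATE = {
    "small": Decimal("0.11"),
    "quadruple-extra-large": Decimal("3.10"),
}
STORAGE_PER_GB_MONTH = Decimal("0.10")
IO_PER_MILLION = Decimal("0.10")
TRANSFER_IN_PER_GB = Decimal("0.10")
EXTRA_BACKUP_PER_GB_MONTH = Decimal("0.15")


def estimate_monthly_cost(instance_class, hours, storage_gb, io_requests,
                          transfer_in_gb=0, backup_gb=0,
                          transfer_out_cost=Decimal("0")):
    """Estimate one month's RDS bill from the article's quoted rates."""
    instance = HOURLY_RATE[instance_class] * hours
    storage = STORAGE_PER_GB_MONTH * storage_gb
    io = IO_PER_MILLION * Decimal(io_requests) / Decimal(1_000_000)
    transfer_in = TRANSFER_IN_PER_GB * transfer_in_gb
    # Backups are free up to the storage originally allotted.
    extra_backup_gb = max(backup_gb - storage_gb, 0)
    backup = EXTRA_BACKUP_PER_GB_MONTH * extra_backup_gb
    total = (instance + storage + io + transfer_in + backup
             + transfer_out_cost)
    return total.quantize(Decimal("0.01"))


# A Small instance running 720 hours with 20GB of storage, 5 million
# I/O requests, 10GB transferred in, and 25GB of backups.
example = estimate_monthly_cost("small", 720, 20, 5_000_000,
                                transfer_in_gb=10, backup_gb=25)
```

Here `example` comes to $83.45: $79.20 for the instance hours, $2.00 for storage, $0.50 for I/O, $1.00 for inbound transfer, and $0.75 for the 5GB of backups beyond the allotted 20GB. Transfers to and from an EC2 instance in the same region are free, so leave those out of the transfer figures.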

This story, "First look: Amazon RDS gives you a MySQL server in the cloud" was originally published by InfoWorld.