How I cut my data center costs by $700,000
SmugMug founder gives inside look at life with Amazon's S3
March 30, 2007 12:00 PM ET
Computerworld - Amazon.com Inc.’s Web storage service S3 (Simple Storage Service) recently passed the one-year mark. While S3 has gotten many positive mentions in the press, reports from actual customers have been harder to find. At O’Reilly Media Inc.’s Emerging Technology (ETech) conference in San Diego on Wednesday, SmugMug Inc. CEO Don MacAskill stepped up, giving a kiss-and-tell account of his "love affair" with S3 -- the good and the bad.
Hearts and flowers
The Mountain View, Calif.-based photo-sharing site, which competes with the likes of Flickr, uses S3 to host about 192TB of photos.
Founded in 2002, SmugMug formerly ran, according to MacAskill’s blog, "single processor commodity Pentium 4 servers attached to really cheap Apple Xserve RAID arrays" to store all of its photos in-house. Even though they were "high-bang-per-buck hardware" running Red Hat Linux, they were much pricier than S3, which charges 15 cents per gigabyte per month. MacAskill, who is also SmugMug's "chief geek," estimates that the company has saved almost $700,000 in its first year post-switch.
SmugMug’s fast growth -- it was doubling its storage requirements every year -- along with the opportunity to offload hardware management issues and "focus on the application, not the muck," convinced MacAskill to make the change. In its first year with S3, SmugMug spent $230,000 on storage fees, not including the labor cost of transferring existing photos to the new system. That compares with the $922,000 MacAskill figures he would’ve spent on server and storage hardware in the same time period.
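The savings figure follows from the two totals quoted above. A minimal sketch of that arithmetic, using only the article's numbers; the function names are illustrative, and the 192TB monthly figure assumes 1TB = 1,000GB and a flat 15-cent rate with no other charges:

```python
# Figures quoted in the article, in US dollars unless noted.
HARDWARE_ESTIMATE = 922_000   # projected first-year server and storage spend
S3_FEES = 230_000             # first-year S3 storage fees (excludes migration labor)
S3_CENTS_PER_GB_MONTH = 15    # S3 price cited in the article


def first_year_savings(hardware_estimate: int, s3_fees: int) -> int:
    """Difference between the projected hardware spend and the S3 fees."""
    return hardware_estimate - s3_fees


def monthly_storage_cost(gigabytes: int, cents_per_gb: int = S3_CENTS_PER_GB_MONTH) -> float:
    """Monthly fee in dollars; multiplies in whole cents to avoid float drift."""
    return gigabytes * cents_per_gb / 100


savings = first_year_savings(HARDWARE_ESTIMATE, S3_FEES)
print(f"First-year savings: ${savings:,}")  # First-year savings: $692,000

# Assumption: 1 TB = 1,000 GB, so 192 TB is 192,000 GB.
print(f"Monthly fee at 192 TB: ${monthly_storage_cost(192_000):,.2f}")  # $28,800.00
```

The $692,000 difference is the "almost $700,000" cited earlier. The monthly figure reflects the current 192TB, not the smaller average volume stored over the first year.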
S3 has "saved our butts" on occasion, such as the time MacAskill’s brother bent over and knocked out power to tens of terabytes of disks in SmugMug’s data center. Customers didn’t suffer, he says, because access automatically failed over to S3’s version of the data.
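MacAskill did not describe how the failover is implemented. As a rough illustration only, a read path that tries the local copy first and falls back to a second store might look like the sketch below; the reader functions and photo key are hypothetical stand-ins, not SmugMug's code or Amazon's API:

```python
from typing import Callable

# A reader takes a photo key and returns the photo bytes (hypothetical interface).
Reader = Callable[[str], bytes]


def read_with_failover(key: str, primary: Reader, fallback: Reader) -> bytes:
    """Try the primary store; if it is unreachable, read the fallback copy."""
    try:
        return primary(key)
    except OSError:
        # Primary store is down (for example, its disks lost power).
        return fallback(key)


def offline_local_store(key: str) -> bytes:
    """Stand-in for local disks that have lost power."""
    raise OSError("local storage unavailable")


def remote_copy(key: str) -> bytes:
    """Stand-in for the copy held by the remote storage service."""
    return b"photo-bytes-for-" + key.encode()


photo = read_with_failover("vacation-001.jpg", offline_local_store, remote_copy)
print(photo)  # b'photo-bytes-for-vacation-001.jpg'
```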
Hates and glowers
Not everything’s perfect. For instance, the speed of delivery for data stored on S3 can be slow, because S3 lacks the edge caching features that are standard in true content-delivery networks.
To get around that, SmugMug uses a tiered structure in which 90% of its data is stored on S3, and the most frequently accessed 10% remains with SmugMug. That way, S3 mostly serves as a type of archive or backup site, with almost all requests served up faster by SmugMug’s own servers.
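The tiering idea can be sketched in a few lines: rank photos by how often they are requested, keep the top tenth locally, and see what share of requests that tier absorbs. This is an illustration under assumed data; the access counts and function names are made up, and the article does not say how SmugMug actually chooses its hot set:

```python
def pick_hot_keys(access_counts: dict[str, int], hot_fraction: float = 0.10) -> set[str]:
    """Return the most frequently accessed keys, up to hot_fraction of all keys."""
    n_hot = int(len(access_counts) * hot_fraction)
    # Sort by descending count; break ties by key name so the result is stable.
    ranked = sorted(access_counts, key=lambda k: (-access_counts[k], k))
    return set(ranked[:n_hot])


def local_hit_rate(access_counts: dict[str, int], hot_keys: set[str]) -> float:
    """Share of all requests that the locally held (hot) keys would serve."""
    total = sum(access_counts.values())
    hits = sum(count for key, count in access_counts.items() if key in hot_keys)
    return hits / total if total else 0.0


# Hypothetical traffic: one very popular photo and nine rarely viewed ones.
counts = {"photo-0": 910, **{f"photo-{i}": 10 for i in range(1, 10)}}

hot = pick_hot_keys(counts)
print(hot)                                   # {'photo-0'}
print(f"{local_hit_rate(counts, hot):.0%}")  # 91%
```

With skewed traffic like this, a small local tier serves most requests, which is the effect the article describes.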
Amazon.com also doesn't offer service-level agreements for S3, though it claims to strive for 99.99% uptime, MacAskill said. Still, SmugMug has experienced at least five performance problems that MacAskill attributed to S3. Two of those were core switch failures, and one was a DNS problem; each of those three incidents lasted less than half an hour. The other two were not outages but brief slowdowns.
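For scale, a 99.99% uptime target leaves a small annual downtime budget. A quick sketch of the conversion, assuming a 365-day year; the function name is illustrative:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year allowed by a given uptime percentage."""
    return round(MINUTES_PER_YEAR * (100 - uptime_percent) / 100, 2)


print(downtime_budget_minutes(99.99))  # 52.56
```

Three outages of less than half an hour each total somewhere under 90 minutes, so the figures given here do not settle whether S3 met that target over the year.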
"They weren’t a big deal. Everything fails, so you kind of expect it," MacAskill said.
One area where S3 is definitely weak is in customer and technical support. The system currently lacks such useful tools as status dashboards for customers, proactive notifications and the "ability to get hold of a human," MacAskill said. "Amazon is not great at this stuff yet."
MacAskill is a big fan of the REST API used by S3, which he said is so "human readable" that he can sometimes debug problems within a Web browser. He would like to see Amazon.com add a database API to S3, a load balancer and possibly even a true content-delivery network, provided that the price isn’t too high.