Here's how Google and Amazon scale up their huge IT operations
As large IT systems scale to unforeseen levels of complexity, new laws of effective management come into play
IDG News Service - Internet giants such as Google and Amazon run IT operations that are far larger than most enterprises even dream of, but lessons they learn from managing those humongous systems can benefit others in the industry.
At a few conferences in recent weeks, engineers from Google and Amazon revealed some of the secrets they use to scale their systems with a minimum of administrative headache.
At the Usenix LISA (Large Installation Systems Administration) conference in Washington, Google site reliability engineer Todd Underwood highlighted one of the company's imperatives that may be surprising: frugality.
"A lot of what Google does is about being super-cheap," he told an audience of systems administrators.
Google is forced to maniacally control costs because it has learned that "anything that scales with demand is a disaster if you are not cheap about it."
As a service grows more popular, its costs must grow in a "sub-linear" fashion, he said.
"Add a million users, you really have to add less than a 1,000 quanta of whatever expense you are incurring," Underwood said. A "quanta" of expense could be people's time, compute resources, or power.
That thinking is behind Google's efforts not to purchase off-the-shelf routing equipment from companies such as Cisco or Juniper. Google would need so many ports that it's more cost-effective to build its own, Underwood said.
He refuted the idea that the challenges Google faces are unique to a company of its size. For one, Google is composed of many smaller services, such as Gmail and Google+.
"The scale of all of Google is not what most application developers inside of Google deal with. They run these things that are comprehensible to each and every one of you," he told the audience.
Another technique Google employs is to automate everything possible. "We're doing too much of the machines' work for them," he said.
Ideally, an organization should get rid of its system administration altogether, and just build and innovate on existing services offered by others, Underwood said, though he admitted that's not feasible yet.
Underwood, who has a flair for the dramatic, stated: "I think system administration is over, and I think we should stop doing it. It's mostly a bad idea that was necessary for a long time but I think it has become a crutch."
Google's biggest competitor is not Bing or Apple or Facebook. Rather, it is itself, he said. The company's engineers aim to make its products as reliable as possible, but that's not their sole task. If a product is too reliable -- which is to say, beyond the five 9's of reliability (99.999 percent) -- then that service is "wasting money" in the company's eyes.
- Hadoop for Dummies Today, organizations in every industry are being showered with imposing quantities of new information. Along with traditional sources, many more data channels and...
- The Top Five Ways to Get Started with Big Data Despite the increased focus on big data over the past few years, most organizations are still talking about what big data is rather...
- Data Warehouse Augmentation: The Queryable Data Store While organizations have, to date, been busy exploring and experimenting, they are now beginning to focus on using big data technologies to solve...
- The IBM Big Data Platform IBM is unique in having developed an enterprise class big data platform that allows you to address the full spectrum of big data...
- Live Webcast Best Practices: How to Improve Business Continuity with Virtualization VMware solutions include a range of business continuity capabilities to help ensure availability for applications across your virtualized environment. Learn More>>
- Cloud Knowledge Vault Learn how your organization can benefit from the scalability, flexibility, and performance that the cloud offers through the short videos and other resources...
- Endpoint Data Management: Protecting the Perimeter of the Internet of Things Not surprisingly, "Internet of Things" (IoT) and Big Data present new challenges AND opportunities for enterprise IT. Teams need to harness, secure and... All Data Center White Papers | Webcasts