When President Barack Obama signed the Open Data Executive Order last May, many IT leaders applauded the White House's decision to release treasure troves of public data as part of an important government initiative for greater transparency.
However, what many didn't bargain for was the state in which they'd find these once-buried data sets. "A dog's breakfast," "a train wreck," "a massive hairball" -- those are a few of the terms IT leaders have used to describe the vast volumes of government data now being made available to the public.
Yet the business opportunities are unprecedented -- open data offers bits and bytes of public information that are freely available for anyone to use to build new businesses, generate revenue, develop new products, conduct research or empower consumers. With the federal government as the single largest source of open data in the U.S., we now have unfettered access to information about everything from bus routes and pollution levels to SEC filings and weather patterns. Savvy businesses are using public data to predict consumer behavior, shape marketing campaigns, develop life-saving medications, evaluate home properties, even rate retirement plans.
In fact, recent research from the McKinsey Global Institute predicts that the use of open data could lead to the creation of $3 trillion or more in annual value across the global economy. "The vast potential of open data is not only to increase transparency and accountability of institutions, but also to help create real economic value both for companies and individuals," says Michael Chui, a partner at the McKinsey Global Institute, the business and economic research arm of McKinsey & Co.
Hoping to capitalize on this open data revolution, IT leaders are taking the lead in discovering the value of converting terabytes of data into new revenue streams. Forget about the open-source movement's clarion call for free software, greater collaboration and anti-establishment bootstrapping. Today's open data trend is driven by a desire for both greater government transparency and a fatter bottom line. And as more and more techies clamor for a seat at the table, they're finding that the era of open data represents a prime opportunity to prove that they're indispensable revenue generators, not just server-room sages.
"We're at a tipping point," says Joel Gurin, senior adviser at New York University's Governance Lab (GovLab) and author of Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation. "This is the year open data goes from being a specialized expertise to becoming part of a CIO's tool kit. It's a very exciting time."
But unlocking open data's value remains a challenge. For one thing, much of today's open data flows from a whopping 10,000 federal information systems, many of which are based on outdated technologies. And because open data can be messy and riddled with inaccuracies, IT professionals struggle to achieve the data quality and accuracy levels required for making important business decisions. Then there are the data integration headaches and lack of in-house expertise that can easily hinder the transformation of open data into actionable business intelligence.
Yet for those IT leaders who manage to convert decades-old county records, public housing specs and precipitation patterns into a viable business plan, "the sky's the limit," says Gurin.
Gurin should know. As part of his work at the GovLab, he's compiling a who's who list of U.S. companies that are using government data to generate new business. Known as the Open Data 500, the list features a wide array of businesses -- from scrappy startups like Calcbench, which is turning SEC filings into financial insights for investors, to proven successes like The Climate Corporation. Recently purchased by Monsanto for about $1 billion, The Climate Corporation uses government weather data to revamp agricultural production practices.
Although their business models vary wildly, there's one thing the Open Data 500 companies have in common: IT departments that have figured out how to collect, cleanse, integrate and package reams of messy data for public consumption.