Government open data proves a treasure trove for savvy businesses

Hoping to capitalize on free government information, IT leaders are discovering the value -- and vexation -- of converting terabytes of data into new revenue streams.

Page 2 of 4

Take Zillow, for example. Zillow is an online real estate database that crunches housing data to provide homeowners and real estate professionals with estimated home values, foreclosure rates and the projected cost of renting vs. buying. Founded in 2005 by two former Microsoft executives, the Seattle-based outfit is now valued at more than $1 billion. But success didn't arrive overnight.

"Anyone who has worked with public record data knows that real estate data is among the noisiest you can get. It's a train wreck," says Stan Humphries, chief economist at Zillow, referring to the industry's lack of standard formatting. "It's our job to take that massive hairball and pull relevant facts out of it."

Today, Zillow has a team of 16 Ph.D.-wielding data analysts and engineers who use proprietary advanced analytics tools to synthesize everything from sales listings to census data into easy-to-digest reports. One of the biggest challenges for Zillow's IT department has been creating a system that integrates government data from more than 3,000 counties.

"There is no standard format, which is very frustrating," says Humphries. "We've tried and tried to push the government to come up with standard formats, but from [each individual] county's perspective, there's no reason to do it. So it's up to us to figure out 3,000 different ways to ingest data and make sense of it."
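One common way to cope with many incompatible sources is to write a small adapter per source that maps its rows onto a single shared record type. The Python sketch below illustrates that idea only; it is not a description of Zillow's proprietary system. The two county layouts, the field names (`PARCEL`, `PRICE`, `SALE_DT`, `apn`, `amt_cents`, `recorded`) and the `SaleRecord` type are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Callable, Dict


@dataclass(frozen=True)
class SaleRecord:
    """Common schema that every county adapter must produce (hypothetical)."""
    county: str
    parcel_id: str
    sale_price: float
    sale_date: date


def parse_county_a(row: dict) -> SaleRecord:
    # Hypothetical county A: price like "$350,000", date as MM/DD/YYYY.
    return SaleRecord(
        county="A",
        parcel_id=row["PARCEL"].strip(),
        sale_price=float(row["PRICE"].replace("$", "").replace(",", "")),
        sale_date=datetime.strptime(row["SALE_DT"], "%m/%d/%Y").date(),
    )


def parse_county_b(row: dict) -> SaleRecord:
    # Hypothetical county B: price in whole cents, ISO 8601 date,
    # parcel numbers written with dashes.
    return SaleRecord(
        county="B",
        parcel_id=row["apn"].replace("-", ""),
        sale_price=int(row["amt_cents"]) / 100,
        sale_date=date.fromisoformat(row["recorded"]),
    )


# One entry per source; a real registry would hold thousands of these.
ADAPTERS: Dict[str, Callable[[dict], SaleRecord]] = {
    "A": parse_county_a,
    "B": parse_county_b,
}


def ingest(county: str, row: dict) -> SaleRecord:
    """Route a raw row to the adapter registered for its county."""
    try:
        adapter = ADAPTERS[county]
    except KeyError:
        raise ValueError(f"no adapter registered for county {county!r}") from None
    return adapter(row)
```

With this layout, supporting another county means adding one function and one registry entry, while everything downstream keeps working against `SaleRecord`.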

Complaints about the lack of standard protocols and formats are common among users of open data.

"Data quality has always been and will continue to be a very important consideration," says Chui. "It's clear that one needs to understand the provenance of the data, the accuracy of the data, how often it's updated, its reliability -- these will continue to be important problems to tackle."

So far, Zillow's plan of attack using "sophisticated big data engineering" is working. In fact, a new set of algorithms has helped improve Zillow's median margin of error from 14% to 8%. And for 60% of all sales that occur, Zillow's estimated sales price is within 10% of the actual figure.
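To make those two figures concrete, here is a minimal Python sketch of how such accuracy numbers could be computed. It assumes "median margin of error" means the median absolute percentage difference between estimate and sale price; the article does not spell out Zillow's exact definition. The function names and the sample prices are made up for illustration.

```python
from statistics import median


def percent_errors(estimates, actuals):
    """Absolute error of each estimate as a fraction of the actual sale price."""
    return [abs(est - act) / act for est, act in zip(estimates, actuals)]


def median_error(estimates, actuals):
    """Median absolute percentage error (assumed meaning of 'median margin of error')."""
    return median(percent_errors(estimates, actuals))


def share_within(estimates, actuals, tolerance=0.10):
    """Fraction of sales whose estimate landed within `tolerance` of the actual price."""
    errors = percent_errors(estimates, actuals)
    return sum(err <= tolerance for err in errors) / len(errors)


# Made-up sample data, not Zillow figures.
estimates = [210_000, 305_000, 460_000, 95_000, 530_000]
actuals = [200_000, 300_000, 500_000, 100_000, 400_000]

print(f"median error: {median_error(estimates, actuals):.1%}")
print(f"within 10%:   {share_within(estimates, actuals):.0%}")
```

Because the median ignores how large the worst misses are, a low median error can coexist with a sizable share of estimates that fall outside the 10% band, which is why the two numbers are reported together.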
