White House set to unleash 100,000 federal data sources via data.gov
The real test of the openness of government database information will be public adoption
Computerworld - WASHINGTON -- The U.S. plans to make more than 100,000 data sources available by the end of next week on its data.gov site, in what may be the real start of the government's effort to share its vast stores of data with the world.
Data.gov has been open for business for about two weeks, but with fewer than 100 data sources available, it is for now just a teaser of a site.
Data.gov is cataloging data and presenting it in standard formats such as CSV, XLS, XML, and Keyhole Markup Language (KML), which is used in Google Earth, among others. In many cases, agencies will develop widgets and other tools to make the data accessible and interesting. A simple example is the FBI's Top Ten Wanted widget.
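For developers, the practical appeal of these formats is that common standard libraries can read them. The following is a minimal sketch in Python, not code published by data.gov: the sample CSV and KML text, the column names, the function names, and the use of the KML 2.2 namespace are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration; not taken from data.gov.
CSV_TEXT = "state,facilities\nOhio,12\nTexas,30\n"

KML_TEXT = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Sample site</name>
      <Point><coordinates>-77.0365,38.8977,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""

# Assumes the file declares the KML 2.2 namespace; older files may use another.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_csv_rows(text):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def read_kml_points(text):
    """Return (name, longitude, latitude) for each Placemark that has a Point."""
    root = ET.fromstring(text)
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # skip placemarks without point geometry
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = coords.strip().split(",")[:2]
        points.append((name, float(lon), float(lat)))
    return points


if __name__ == "__main__":
    for row in read_csv_rows(CSV_TEXT):
        print(row["state"], row["facilities"])
    for name, lon, lat in read_kml_points(KML_TEXT):
        print(name, lon, lat)
```

CSV values arrive as strings, so any numeric columns would need converting before analysis.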
But the real test will be public adoption. Federal CIO Vivek Kundra said the effort to build out data.gov is "a very high priority" because he believes it has the potential of "unlocking the innovation and tapping into the ingenuity" of the private sector as well as Americans generally. Users will also be able to rank data sets on their utility, usefulness, and ease of access.
Over time, the U.S. will continue to expand the data sets, as well as add tools to help users extract and work with government data.
Kundra's hope is that people will take data from multiple sources and develop new insights. "The intersection of true value is generally around multiple disciplines," he said, in a briefing today with reporters.
Kundra said he doesn't know how many sources are available. The U.S. has more than 10,000 systems, some of which contain rivers of data, but getting at that data may take investments and more processing power to serve up the information, he said.
As the U.S. upgrades systems, a core requirement will be to ensure the new systems are capable of data sharing. But government transparency will be the "default," he said.
The Sunlight Foundation in Washington is running a contest for developers, with some $20,000 in prize money, to build anything from client applications to iPhone applications to Web-based apps that work with federal data. The contest's first criterion: "Does the app help citizens see things that they couldn't see before the app existed?"
Sunlight Labs director Clay Johnson said that most of what the government is now doing is consolidating data that is already public but is often difficult to find. He said that creating a catalog is no small thing, considering that there may be forgotten gold mines of data in government systems.
What may be the test for the government over time is whether it is willing to release data that hasn't been easily available, such as financial disclosure forms for Senate-appointed administration officials. "Don't just release the data that's convenient for you to release, release the data that should be released," said Johnson.
