Data.gov gets an open-source revamp

The updated Data.gov site relies heavily on open-source software such as Solr, WordPress and CKAN

The U.S. government's portal for the data it creates, Data.gov, is getting a revamp that should make it easier to view and reuse government data.

The update should also help federal agencies comply with a White House executive order issued in May to make government data machine-readable by default.

The beta version of the site, now available for user testing at the subdomain Next.Data.gov, features more visualization of government data, an expanded section for communities of interest, and a stream of examples of government data usage by third parties.

"It looks different, and it is exciting that they are pulling in more information about how data is used and how people are talking about" government data, said John Wonderlich, policy director for the Sunlight Foundation, a nonprofit organization that seeks to foster greater government openness and transparency through the use of the Internet. "The first look is encouraging."

An early initiative from the Obama administration, Data.gov was launched in 2009 as a way to collect and provide a portal for data sets created by U.S. federal agencies, so they can be viewed and reused by the public.

In much the same way that the Defense Department's GPS (Global Positioning System) data has fueled the growth of geolocation-based businesses, so too should these additional government data sets generate new businesses, President Barack Obama has argued.

The site's popularity has been growing steadily. In May, it received 213,000 visitors, more than twice as many as in May 2012.

Data.gov now offers more than 70,000 data sets, from 174 agencies and agency offices. It also offers almost 300 APIs (application programming interfaces) for agency services.

Data.gov's challenge is to ensure that "as much data as possible ends up there and that agencies take seriously the requirement that they are open with their information," Wonderlich said.

The White House charged its Office of Management and Budget (OMB) with developing the site. The OMB in turn had the White House's Office of Science and Technology Policy (OSTP) oversee the project. The General Services Administration (GSA) manages the operations and development of Data.gov.

For the update, "The team studied the usage patterns on Data.gov and found that visitors were hungry for examples of how data are used," wrote Nick Sinai, U.S. deputy chief technology officer, and Ryan Panchadsaram, senior adviser to the U.S. chief technology officer, in a co-bylined blog post announcing the update.

The site will include a stream of blog posts, tweets, quotes and other sources showing how people and organizations are using government data feeds. "It certainly helps the Data.gov brand to have people understand how data is being used in the world," Wonderlich said.

Data.gov will also visualize some sets of government data in a rotating display on the home page, using the D3.js JavaScript library. The preview shows a visualization of earthquake data from the U.S. Geological Survey.

For the site, the Data.gov development team will rely heavily on open-source software. The redesigned site will use the Apache Solr search server software to improve the site's search capabilities. Agencies that post their metadata in the Common Core Metadata Schema will have their data sets indexed by Data.gov.
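To give a sense of what a Solr-backed search looks like from the client side, here is a minimal sketch in Python that builds a query URL for Solr's standard select handler. The host, the core name "datasets" and the field names are assumptions made for illustration; the article does not describe how Data.gov's Solr instance is actually configured.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the article does not give Data.gov's Solr host or core name.
SOLR_BASE = "http://localhost:8983/solr/datasets"


def build_solr_query(terms, rows=10, fields=("title", "description")):
    """Return a URL for Solr's /select handler asking for JSON results."""
    params = {
        "q": terms,               # the search terms
        "rows": rows,             # maximum number of documents to return
        "fl": ",".join(fields),   # fields to include (assumed field names)
        "wt": "json",             # response format
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_solr_query("earthquake"))
```

Fetching the resulting URL against a running Solr core would return the matching documents; the sketch stops at building the request so that it runs without a server.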

For the data catalog, the site will use the CKAN (Comprehensive Knowledge Archive Network) data management platform. For content and the community sections, Data.gov will use the WordPress content management system.
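CKAN exposes its catalog through a JSON "action" API, which is part of what makes the data sets reusable by programs as well as people. The Python sketch below builds a request for CKAN's package_search action and pulls dataset titles out of a response. The base URL is a placeholder and the sample response is invented for illustration; neither comes from the article.

```python
import json
from urllib.parse import urlencode

# Placeholder base URL: the article does not say where the CKAN catalog is hosted.
CKAN_BASE = "https://catalog.example.gov"


def package_search_url(query, rows=5):
    """Build a URL for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{CKAN_BASE}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [pkg["title"] for pkg in payload["result"]["results"]]


# Invented sample, shaped like the envelope CKAN's action API returns.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "example-quakes", "title": "Example earthquake feed"}],
    },
})

if __name__ == "__main__":
    print(package_search_url("earthquake"))
    print(dataset_titles(SAMPLE_RESPONSE))
```

Keeping the URL builder separate from the parser means the parsing logic can be checked against a canned response without any network access.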

Even the fonts are open source -- the new site will use the Abel and Lato fonts, from Google Fonts, though some argue that these fonts, available only as a service, aren't truly open source in nature.

Nonetheless, the use of open source is a "reassuring sign" that Data.gov is moving further in line with the White House's preference to maximize the use of open-source software, Wonderlich said.

Data.gov is seeking feedback on the new design through GitHub, Twitter and Quora.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
