White House set to unleash 100,000 federal data sources via data.gov

04.06.2009
WASHINGTON -- The U.S. plans to make more than 100,000 data sources available by the end of next week on its Data.gov Web site, in what may be the real start of the government's effort to share its vast database with the world.

Data.gov has been open for business for about two weeks, but with fewer than 100 data sources available, it is for now just a teaser of a site.

Data.gov is cataloging data and presenting it in standard formats such as CSV, XLS, XML and Keyhole Markup Language (KML), which is used in Google Earth, among others. In many cases, agencies will develop widgets and other tools. A simple example is the .
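
For readers who want a sense of what working with these formats involves, the following is a minimal sketch in Python using only the standard library. The sample CSV and KML text, the field names and the function names are all hypothetical and are not drawn from any actual Data.gov catalog entry; real files may be structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; not taken from any real Data.gov catalog entry.
CSV_TEXT = "station,state,reading\nA1,VA,3.2\nB7,MD,4.8\n"

KML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>A1</name>
      <Point><coordinates>-77.04,38.90,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""

# KML elements live in this XML namespace, so lookups must be qualified.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_csv_rows(text):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def read_kml_points(text):
    """Return (name, longitude, latitude) for each Placemark with a Point."""
    root = ET.fromstring(text.encode("utf-8"))
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # skip placemarks that are not simple points
        # KML stores coordinates as "longitude,latitude[,altitude]".
        lon, lat = coords.strip().split(",")[:2]
        points.append((name, float(lon), float(lat)))
    return points


if __name__ == "__main__":
    for row in read_csv_rows(CSV_TEXT):
        print(row["station"], row["state"], row["reading"])
    for name, lon, lat in read_kml_points(KML_TEXT):
        print(name, lon, lat)
```

Both functions reduce their input to plain Python lists, which is one simple way to get tabular and geographic data into a common shape before combining them.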

But the real test will be public adoption. Federal CIO Vivek Kundra said Data.gov is "a very high priority" because he believes it has the potential of "unlocking the innovation and tapping into the ingenuity" of the private sector as well as Americans generally. Users will also be able to rank data sets on their utility, usefulness, and ease of access.

Over time, the U.S. will continue to expand the data sets, as well as add tools to help users extract and work with government data.

Kundra's hope is that people will take data from multiple sources and develop new insights. "The intersection of true value is generally around multiple disciplines," he said in a briefing with reporters today.

Kundra said he doesn't know how many sources are available. The U.S. has more than 10,000 systems, some of which contain rivers of data, but getting at that data may take investments and more processing power to serve up the information, he said.

As the U.S. upgrades systems, a core requirement will be to ensure the new systems are capable of data sharing. But government transparency will be the "default," he said.

The Sunlight Foundation in Washington is running a contest, with some $20,000 in prize money, to build anything from client applications and iPhone applications to Web-based apps that work with federal data. The contest's first criterion: "Does the app help citizens see things that they couldn't see before the app existed?"

Sunlight Labs director Clay Johnson said that most of what the government is now doing is consolidating data that is already public but is often difficult to find. He said that creating a catalog is no small thing, considering that there may be forgotten gold mines of data in government systems.

What may be the test for the government over time is whether it is willing to release data that hasn't been easily available, such as financial disclosure forms for Senate-appointed administration officials. "Don't just release the data that's convenient for you to release, release the data that should be released," said Johnson.