Solution Architecture

Super Crunching

Require agencies to submit data sets in a standard format with common metadata fields, including short and long descriptions to help users understand each data set.
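To make the idea of "common metadata fields" concrete, here is a minimal sketch of what a standardized submission record could look like. The class name, field names, and the `missing_fields` helper are all hypothetical, chosen for illustration; nothing here reflects an actual data.gov schema.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical common metadata record; all field names are illustrative."""
    agency: str
    title: str
    short_description: str
    long_description: str
    units: str
    # Observations keyed by year, e.g. {2005: 1.2, 2006: 1.4}
    values: dict[int, float] = field(default_factory=dict)


def missing_fields(record: DatasetRecord) -> list[str]:
    """Return the names of required text fields that were left blank."""
    required = ["agency", "title", "short_description", "long_description", "units"]
    return [name for name in required if not getattr(record, name).strip()]
```

A submission check could reject any record for which `missing_fields` returns a non-empty list, so every data set arrives with both descriptions filled in.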

Create a user interface that enables users to easily graph multiple time-series data sets (simple trend graphs). This will let them visually compare different data sets on relative scales.
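One way to put different data sets on a relative scale is to index each series to a common base year, so every line starts at 100 on the same graph. This is only a sketch of that one approach; the function name and the year-to-value dictionary layout are assumptions, not something the site provides.

```python
def index_to_base(series: dict[int, float], base_year: int) -> dict[int, float]:
    """Rescale a yearly series so that its value in base_year equals 100."""
    if base_year not in series:
        raise KeyError(f"series has no value for base year {base_year}")
    base = series[base_year]
    if base == 0:
        raise ValueError("cannot index to a base-year value of zero")
    # Sort by year so the result is ready to plot left to right.
    return {year: 100.0 * value / base for year, value in sorted(series.items())}
```

With both series indexed this way, a data set measured in pounds and one measured in dollars can share one axis, and the graph shows relative change rather than raw magnitude.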

By aggregating all government data into a standard format and enabling users to select and compare different data sets, more in-depth "super crunching" becomes possible, such as multivariate regression analysis across selected years and data sets.
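As a rough illustration of the regression step, the sketch below fits an ordinary least squares model to one target series using several predictor series, after keeping only the years they all share. It assumes each data set is a year-to-value dictionary, and it solves the normal equations directly in plain Python for readability; a real tool would more likely rely on an established statistics library. All function names are hypothetical.

```python
def align_years(target, predictors):
    """Return the sorted years present in the target and in every predictor."""
    years = set(target)
    for series in predictors:
        years &= set(series)
    return sorted(years)


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or too few years overlap")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(target, predictors):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    years = align_years(target, predictors)
    if not years:
        raise ValueError("the selected data sets share no years")
    rows = [[1.0] + [p[y] for p in predictors] for y in years]
    ys = [target[y] for y in years]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)
```

The year-alignment step is the part that depends on standardization: it only works if every agency reports against the same time key.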

Other features: 1) allow users to easily export the underlying data from any individual data set or series of data set comparisons, 2) allow users to easily export the graphs/tables they create, with a consistent stamp that indicates the source(s) and when the data was extracted.
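The export-with-stamp feature could be as simple as writing the same two header lines at the top of every file. The sketch below does this for CSV; the `# Source(s):` and `# Extracted:` line formats and the function name are assumptions made up for this example.

```python
import csv
import io
from datetime import datetime, timezone


def export_csv(rows, header, sources, extracted_at=None):
    """Return CSV text that starts with a consistent source/extraction stamp."""
    # Default to the current UTC time when the caller gives no timestamp.
    extracted_at = extracted_at or datetime.now(timezone.utc)
    buffer = io.StringIO()
    buffer.write(f"# Source(s): {'; '.join(sources)}\n")
    buffer.write(f"# Extracted: {extracted_at.strftime('%Y-%m-%d %H:%M UTC')}\n")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

Because the stamp is generated in one place, every exported table carries the same attribution format no matter which data sets were combined.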

Data.gov is currently a hodge-podge of data sets and random "apps". I work at the EPA and found the TRI data sets horribly unorganized and not very useful. Power users would go to the EPA website instead, and new users would find it difficult to make sense of the data. I think it is imperative to make data.gov a database of standardized data sets instead of just a flea market for whatever agencies submit.

For many data sets, particularly the economic data, this standardization would not be hard to do. The burden on each agency is minimal compared to the benefit of being able to access all government data in a structured way from a common database.

Voting

23 votes
Idea No. 43