Solution Architecture

For this topic area, post your best ideas on Data.gov and its capabilities, including data management, dissemination, search, the semantic web, evolving core modules (shared services), data infrastructure, visualization tools, and more.
(@afuller)

Make RSS Feeds Available

This entry is a consensus recommendation of seven organizations that work on government transparency, of which OpenTheGovernment.org is one.

Consider making some of the datasets available as feeds that are kept constantly up to date, rather than as static datasets that are pulled down and then reposted only occasionally.
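
As a rough illustration of the feed idea, the Python sketch below builds a minimal RSS 2.0 document from a list of dataset updates using only the standard library. The function name `build_feed`, the record keys (`title`, `url`, `updated`), and the example.invalid URLs are assumptions made for this example; nothing here reflects how Data.gov actually publishes data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, updates):
    """Return an RSS 2.0 document (as a string) listing dataset updates.

    `updates` is a list of dicts with hypothetical keys:
    'title', 'url', and 'updated' (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Recent dataset updates"
    for update in updates:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = update["title"]
        ET.SubElement(item, "link").text = update["url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(update["updated"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "title": "Example dataset (hypothetical)",
        "url": "http://example.invalid/datasets/1.csv",
        "updated": datetime(2010, 2, 1, 12, 0, tzinfo=timezone.utc),
    }]
    print(build_feed("Example catalog", "http://example.invalid/", sample))
```

A consumer could poll such a feed and fetch only the datasets whose `pubDate` has changed, instead of re-downloading static files.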

Voting

11 votes
Active
(@michael.daconta)

Don't Inflate Counts! (aka let a Dataset contain multiple files)

When searching for interesting datasets it can be very frustrating to see identical datasets (except by year) cluttering the search results. To me, this seems like just an attempt to inflate the counts by treating separate years or geography as separate datasets. Here is my proposed simple rule: If the dataset metadata fields (i.e. like "Coverage Date") are the same, it is the SAME dataset. Just because the instance ...more »
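
One possible reading of this rule, sketched in Python: catalog records that differ only in per-instance fields (such as year or file location) collapse into a single dataset that lists several files. The field names `coverage_date` and `file_url` and the record layout are hypothetical, not the Data.gov schema, and the full proposal is truncated above, so this is an interpretation rather than the author's exact rule.

```python
from collections import defaultdict

# Hypothetical per-instance fields; every other field is treated as
# dataset-level metadata that must match for records to be merged.
INSTANCE_FIELDS = frozenset({"coverage_date", "file_url"})


def group_into_datasets(records):
    """Group flat catalog records into datasets that each hold several files.

    Records whose dataset-level metadata is identical end up in one group.
    Metadata values are assumed to be hashable (for example, strings).
    """
    groups = defaultdict(list)
    for rec in records:
        key = tuple(sorted((k, v) for k, v in rec.items()
                           if k not in INSTANCE_FIELDS))
        groups[key].append({k: rec[k] for k in INSTANCE_FIELDS if k in rec})
    return [{"metadata": dict(key), "files": files}
            for key, files in groups.items()]
```

With this grouping, a search result page would show one row per dataset and let the user drill down into its files.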

Voting

10 votes
Active
(@chuckgeorgo)

Data.gov Dashboard - Use FEA BRM/SRM to model dashboard measures

I would like to recommend that the Data.gov dashboard be developed (aligned) with the Federal Enterprise Architecture (FEA) Business Reference Model (BRM) and Service Component Reference Model (SRM). The BRM provides a view of the federal government’s core lines of business (the products and services it delivers to its citizens, the private sector, and other government agencies), and the SRM classifies the internal ...more »

Voting

9 votes
Active
(@ken000)

Have Datasets Divided by Geography Appear as one set in search.

Data.gov would be more usable if there were not so many entries on the catalog search page taken up by one dataset that is divided up geographically. A single row in the search results that indicated the geographical divisions of the data and allowed the user to drill down to the geography-specific files would make Data.gov more usable. Similarly, different versions of a dataset released on different dates (such as those ...more »

Voting

8 votes
Active
(@davidsmith)

Data Enhancement/Manipulation Capabilities

Provide basic capability for data enhancement, manipulation, and packaging. Here, the idea would be to provide reusable infrastructure that can be used by data stewards for data enhancement, and potentially conversion and packaging. Sample use case: a data steward uploads an Excel spreadsheet containing his data, which also includes addresses, and then uses tools provisioned by Data.gov to geocode the dataset to ...more »

Voting

8 votes
Active
(@charleshoffman)

Data sets should be extensible like XBRL or RDF/OWL

Data sets should be extensible and flexible, as XBRL and RDF/OWL are, rather than bound to fixed schemas. This flexibility allows the data sets to evolve, and it allows others to connect additional information to existing information. This is the notion of "linked data" as used by the Semantic Web people. Both XBRL and RDF/OWL are modeled as graphs, which are extremely flexible. Combine these graphs with the ...more »
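
To make the extensibility point concrete, here is a small Python sketch that models a dataset description as a set of subject-predicate-object triples, which is the general shape of RDF data. It is plain Python rather than an RDF library, and the `ex:` identifiers and predicate names are invented for illustration.

```python
def add_triple(graph, subject, predicate, obj):
    """Add one (subject, predicate, object) statement to the graph."""
    graph.add((subject, predicate, obj))


def describe(graph, subject):
    """Return all (predicate, object) pairs known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


graph = set()

# Statements the publisher might make (hypothetical identifiers).
add_triple(graph, "ex:dataset/1", "ex:title", "Example dataset")
add_triple(graph, "ex:dataset/1", "ex:publisher", "ex:agency/A")

# A third party later attaches new information without any schema change:
# it simply adds another statement about the same subject.
add_triple(graph, "ex:dataset/1", "ex:relatedTo", "ex:dataset/2")

for predicate, obj in describe(graph, "ex:dataset/1"):
    print(predicate, obj)
```

With a fixed table schema, the third statement would have required a new column; in the graph form it is just one more edge.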

Voting

7 votes
Active
(@afuller)

Make web-based interfaces available for public use

This entry is a consensus recommendation of seven organizations that work on government transparency, of which OpenTheGovernment.org is one. Some of the currently posted files are quite large, ranging upward to several hundred megabytes. Their large size undermines their usefulness for most people or organizations. The large number of currently posted datasets also makes it difficult to find a particular database of ...more »

Voting

7 votes
Active
(@michael.daconta)

Remove and Guard Against (Validate) "Junk" Records

I tried to download a dataset that interested me called the "Occupational Outlook Handbook". The Data.gov "record" says that it is a CSV dataset. However, when you click the CSV link you go to a website that does not allow you to download the dataset! The Data.gov link is http://www.data.gov/details/336. This is a failure of the simplest validation possible: a link that is supposed to be to a dataset must be tested ...more »
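
The kind of check described here could look roughly like the Python sketch below: request the link and confirm that it responds successfully and declares the expected content type. The function name `validate_record` and the `opener` parameter (which lets a test substitute a fake response) are assumptions for this example. Real servers sometimes serve CSV files as `text/plain` or `application/octet-stream`, so a production check would need a more forgiving rule.

```python
import urllib.request


def validate_record(url, expected_type, opener=urllib.request.urlopen, timeout=10):
    """Check that `url` is reachable and declares `expected_type`.

    Returns (ok, detail). `opener` defaults to urllib.request.urlopen and
    can be replaced with a stub for offline testing.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
            content_type = resp.headers.get("Content-Type", "")
    except (OSError, ValueError) as exc:
        # URLError and HTTPError are OSError subclasses; a malformed URL
        # raises ValueError.
        return False, f"request failed: {exc}"
    if not 200 <= status < 300:
        return False, f"unexpected HTTP status {status}"
    if expected_type not in content_type.lower():
        return False, f"expected {expected_type}, got {content_type or 'no Content-Type'}"
    return True, "ok"
```

A catalog could run such a check when a record is submitted and again periodically, flagging records whose link returns an HTML page where a CSV file was promised.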

Voting

7 votes
Active
(@davidsmith)

Link / Status Checker

Following a blog post (http://www.spatiallyadjusted.com/2010/02/07/data-gov-is-already-broken-just-like-everything-before-it/), it would be good to integrate periodic link/service checking, à la Geospatial One Stop's Service Status Checker (http://registry.fgdc.gov/statuschecker/index.php). Any status changes or outages should be reported via notification, e.g. RSS feed or email, and directed to the stewards and registrars managing ...more »

Voting

6 votes
Active
(@jimrolfes)

Include the capability to tie individual data sets w/ super sets

Include the capability to align individual data sets with super sets.

Multiple agencies will need a capability that enables the aggregation of data sets identified within Data.gov. Data.gov can be an excellent tool for linking data from separate owners (or in some cases a single owner) that conform to a consistent standard, so that they can be used individually or combined to form a broader data set.

Voting

5 votes
Active
(@afuller)

Balance the format of data for developers and the public

This entry is a consensus recommendation of seven organizations that work on government transparency, of which OpenTheGovernment.org is one. The format of the data plays a key role in its usability; many within the community of advocates who re-use and repackage government data would prefer data in CSV format, rather than the XML format in which many of the posted databases are provided. Accordingly, we recommend that ...more »
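
For readers wondering how far apart the two formats really are, the Python sketch below derives a CSV from a flat XML file using only the standard library. It is an illustration only and not necessarily what the truncated recommendation proposes. The `row` element name and the sample field names are assumptions, and nested or attribute-heavy XML would need more work.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text, record_tag="row"):
    """Convert flat XML records into CSV text.

    Each <record_tag> element becomes one CSV row and each child element
    becomes a column. Missing fields are written as empty cells.
    """
    root = ET.fromstring(xml_text)
    records = [{child.tag: (child.text or "") for child in rec}
               for rec in root.iter(record_tag)]
    # Collect column names in first-seen order.
    fieldnames = []
    for rec in records:
        for name in rec:
            if name not in fieldnames:
                fieldnames.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__":
    sample = ("<rows><row><state>VA</state><count>3</count></row>"
              "<row><state>MD</state></row></rows>")
    print(xml_to_csv(sample))
```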

Voting

5 votes
Active
(@adriandwalker)

Social Media support for Executable English Q/A

Data by itself is necessary, but not enough, for practical applications. What's needed is knowledge about how to use the data to answer questions -- such as, "how much could the US save through energy independence?" There's emerging technology that leverages social media for the huge task of acquiring and curating the necessary knowledge -- in the form of executable English. One can Google "executable English" to find ...more »

Voting

4 votes
Active
(@louissweeny)

A micro-format tag set to describe data resources on the web

We need as many on-ramps as possible for agencies to get data assets represented on Data.gov (see my On Ramps Idea). How about a micro-format (that is, a small, light set of tags) that could help humans, search engines, and other tools identify and aggregate data resources? This could be as simple as an adapted Dublin Core set. This markup could be carried on the human documentation page, or on a separate page à la sitemap.xml. ...more »

Voting

4 votes
Active
(@davidwanda)

SPEED UP ACCESS BY REDESIGN

DOWNLOADING Data.gov takes forever because of the graphics in the heading. Remove the graphic or redesign so as to speed up downloading of the web site. Sorry I have a dinosaur of a computer, but it's all I have or can afford.

Voting

4 votes
Active
(@ekansa)

Faceted Search with Atom Based Web Services

Data.gov should have a faceted search interface that provides a comprehensive overview of Data.gov content (metadata and facet counts), and a way for users to progressively refine their search criteria. Faceted search provides a good way for outsiders to better understand a big collection described by a complex metadata structure. The faceted metadata should also be shared as an Atom Feed, so that updates of new content ...more »
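
The two core operations of faceted search, counting values per facet and narrowing the result set, can be sketched in a few lines of Python. The field names (`agency`, `format`) and the in-memory list of records are assumptions for this example; a real catalog would compute the same counts inside its search engine.

```python
from collections import Counter


def facet_counts(records, facet_fields):
    """Count how many records carry each value of each facet field."""
    return {field: Counter(rec[field] for rec in records if field in rec)
            for field in facet_fields}


def refine(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [rec for rec in records
            if all(rec.get(field) == value for field, value in selected.items())]


if __name__ == "__main__":
    catalog = [
        {"agency": "EPA", "format": "CSV"},
        {"agency": "EPA", "format": "XML"},
        {"agency": "DOL", "format": "CSV"},
    ]
    # Overview of the whole collection.
    print(facet_counts(catalog, ["agency", "format"]))
    # The user picks one agency; counts are recomputed for the narrower set.
    narrowed = refine(catalog, agency="EPA")
    print(facet_counts(narrowed, ["format"]))
```

Recomputing the counts after each selection is what lets users refine progressively while always seeing how many results each further choice would leave.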

Voting

4 votes
Active