Ideas Contributed [ 14 ]
While examining an interesting dataset supplied as a Comma-Separated Values (CSV) file, I noticed that the file has no header line. It is customary for the first line of a CSV file to list the column names, separated by commas. This dataset does not have the header line: http://www.data.gov/details/1617 So, what is the solution here? I recommend that for each format supported on data.gov there exists ...more »
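A minimal sketch of how a missing header line could be detected and worked around, assuming a simple heuristic (a header row contains no numeric or empty cells while later rows contain at least one number). The function names and the sample rows are made up for illustration; this is not a data.gov tool, and the heuristic would miss a file whose data is entirely text.

```python
import csv
import io


def _is_number(cell):
    """Return True if the cell parses as a number."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def first_row_looks_like_header(csv_text):
    """Heuristic: a header has no numeric or empty cells, and later rows have numbers."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        return False
    first, rest = rows[0], rows[1:]
    if any(_is_number(c) or c.strip() == "" for c in first):
        return False
    return any(_is_number(c) for row in rest for c in row)


def read_with_names(csv_text, fallback_names):
    """Read rows as dicts, supplying column names when the file has no header."""
    if first_row_looks_like_header(csv_text):
        return list(csv.DictReader(io.StringIO(csv_text)))
    return list(csv.DictReader(io.StringIO(csv_text), fieldnames=fallback_names))


# Hypothetical sample data, not taken from the dataset linked above.
with_header = "state,population\nOhio,11500000\nUtah,3200000\n"
without_header = "Ohio,11500000\nUtah,3200000\n"
```

A publisher-side check like this could run before a CSV file is listed, and a consumer can use `read_with_names` to supply the column names that the file omits.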
I tried to download a dataset that interested me called the "Occupational Outlook Handbook". The data.gov "record" says that it is a CSV dataset. However, when you click the csv link you go to a website that does not allow you to download the dataset! The data.gov link is: http://www.data.gov/details/336 This is a failure of the simplest validation possible - a link that is supposed to be to a dataset must be tested ...more »
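The kind of link test described above could look roughly like the following sketch. It assumes that a link advertised as CSV should return HTTP 200 with a body that is not an HTML page; `looks_like_csv_download` and `check_link` are hypothetical names, and the rules are deliberately crude.

```python
import urllib.error
import urllib.request


def looks_like_csv_download(status, content_type, first_bytes):
    """Return True if an HTTP response plausibly carries a CSV file."""
    if status != 200:
        return False
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        return False
    # Some servers mislabel content, so also look at the body itself.
    head = first_bytes.lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return False
    # Crude assumption: a CSV file has at least one comma near the start.
    return b"," in first_bytes


def check_link(url, timeout=10):
    """Fetch the start of the resource and apply the check above."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            content_type = resp.headers.get("Content-Type", "")
            return looks_like_csv_download(resp.status, content_type, resp.read(2048))
    except (urllib.error.URLError, ValueError):
        # Unreachable hosts, HTTP errors, and malformed URLs all fail the check.
        return False
```

Running `check_link` over every catalog record on a schedule would catch links that lead to a web page instead of a file.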
When searching for interesting datasets, it can be very frustrating to see identical datasets (differing only by year) cluttering the search results. To me, this looks like an attempt to inflate the counts by treating separate years or geographies as separate datasets. Here is my proposed simple rule: If the dataset metadata fields (e.g., "Coverage Date") are the same, it is the SAME dataset. Just because the instance ...more »
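One way such a rule could be applied is sketched below. Because the rule above is cut off, which metadata fields define "the same dataset" is left to the caller; the field names and records here are hypothetical and do not reflect the actual data.gov catalog schema.

```python
from collections import defaultdict


def group_same_datasets(records, identity_fields):
    """Group catalog records whose chosen metadata fields are all equal."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record.get(field) for field in identity_fields)
        groups[key].append(record)
    return list(groups.values())


# Hypothetical catalog records for illustration only.
catalog = [
    {"title": "Toxics Release Inventory", "agency": "EPA", "year": 2006},
    {"title": "Toxics Release Inventory", "agency": "EPA", "year": 2007},
    {"title": "Occupational Outlook Handbook", "agency": "DOL", "year": 2008},
]
grouped = group_same_datasets(catalog, ["title", "agency"])
```

Search results could then show one entry per group, with the individual years listed as instances of that entry.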
A citizen should be able to request a description of how a data element is derived. For example, the very controversial "saved or created jobs" data element should have an explanation of the formula or the derived fields used to calculate the number. Another example would be the all-important unemployment data: how is that percentage calculated? This would give the public more confidence in using the data. In ...more »
In light of the excellent work other governments are doing in this area, the US should spearhead an international group (maybe in conjunction with the UN) to share best practices, cross-fertilize experience and maybe even share common code or data formats.
While there are quite a few discussions on this site about search, I would like to see improved browsing via a robust taxonomy/folksonomy of topic areas. In fact, I would recommend a combined top-down/bottom-up approach where you begin with a top-down taxonomy but allow it to be extended via topic area suggestions and popular keywords. As a citizen, I don't always know what I want, but I want to browse and see what is available. ...more »
Many high-profile projects that rely on a community of supporters, like mozilla.org and OpenOffice, post a public roadmap that details major forthcoming plans and milestones. OMB/GSA should do the same for data.gov so that the public can see when planned innovations will occur and that public input is actually influencing the site's evolution.
Though there is another idea regarding a Wikipedia entry, I think it is important to have a separate wiki set up for developers, as there are many developer topics where collaboration is key. Of course, there is plenty of precedent for this: many major development efforts and sites have developer wikis.
What does the community feel are the top 5 metadata elements that should be mandatory for a dataset catalog record, AND of those 5, which do you think is most important to developers and citizens wanting to discover and use datasets that are relevant to them? Here is a recommendation: 1. Subject Coverage (aka topic or "WHAT", for a hierarchical tree display/browsing) 2. Schema Location (for structured datasets) 3. Geospatial ...more »
The agency posting the data should assert that it has performed certain quality gates/checks on the datasets it is publishing. I just downloaded one dataset provided as a spreadsheet, and one of the columns is "Contructor". Either that is a weird term or it is an error and was meant to say "Contractor". Looking at the data (which is a bunch of acronyms), it is probably the latter. Did whoever published the dataset check ...more »
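A quality gate for this kind of error could be as simple as comparing column headers against a list of expected terms. The sketch below uses Python's standard `difflib` module; the function name, the vocabulary, and the similarity cutoff of 0.8 are assumptions for illustration.

```python
import difflib


def suspicious_headers(headers, known_terms, cutoff=0.8):
    """Map each unknown header to the known term it most closely resembles."""
    known_lower = {term.lower(): term for term in known_terms}
    findings = {}
    for header in headers:
        key = header.strip().lower()
        if key in known_lower:
            continue  # exact match, nothing to flag
        matches = difflib.get_close_matches(key, list(known_lower), n=1, cutoff=cutoff)
        if matches:
            findings[header] = known_lower[matches[0]]
    return findings


# Hypothetical header row and vocabulary.
flagged = suspicious_headers(
    ["Contructor", "Award Date"],
    ["Contractor", "Award Date", "Agency"],
)
```

Here `flagged` maps "Contructor" to "Contractor", which a publisher could review before release. Headers with no close match are left alone, so genuinely new terms are not reported as typos.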
I just downloaded some energy data from data.gov on nuclear reactors and one of the columns is:
I have no idea what that means. Every column or field of data should have a definition, and that definition should be available on data.gov or in a standard format with the dataset. In this case, the data dictionary that the catalog record links to does not have the definition of this field.
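Checking a dataset against its data dictionary can be automated. The following is a minimal sketch, assuming the dictionary is available as a mapping from column name to definition; the column names and definitions are invented for illustration and are not the ones from the nuclear reactor dataset.

```python
def undefined_columns(columns, data_dictionary):
    """Return the columns that have no non-empty definition."""
    return [
        column
        for column in columns
        if not data_dictionary.get(column, "").strip()
    ]


# Hypothetical columns and dictionary entries.
columns = ["Reactor Name", "Unit", "XYZ Code"]
dictionary = {
    "Reactor Name": "Official name of the reactor site.",
    "Unit": "",
}
missing = undefined_columns(columns, dictionary)
```

In this example `missing` lists "Unit" (blank definition) and "XYZ Code" (no entry at all), which is the gap a catalog could refuse to accept at publication time.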
The Open Government Directive has specific requirements like the following: Within 45 days, each agency shall identify and publish online in an open format at least three high-value data sets ... and register those data sets via data.gov... These must be data sets not previously available online or in a downloadable format. Tracking those specific requirements on data.gov (separate from the open government dashboard ...more »
Many commercial product catalogs (e.g., Amazon.com) offer a web services API to search/access the catalog. Of course, there are also REST APIs to do this (I don't want a REST versus web services flame war here).
What catalog API would developers suggest for data.gov?
Is there a standard catalog API suitable for data.gov?
Discuss via your comments...
As discussed in the CONOPS, semantic.data.gov will be an adjunct, experimental site to assist in the evolution of data.gov towards greater semantics. This site will also draw on lessons learned from data.gov.uk, which is also exploiting semantic technologies. What would be the most important initial use cases the semantic web community would like to see semantic.data.gov tackle? Ontology development? Rule-based alerts? ...more »