As the Dublin Core proved, a little bit of standardization can go a long way. In the open gov data arena, a short list of the areas where global standards would make sense would help all governments and citizens get clear on what needs to be developed for voluntary adoption and where we develop the standards. This idea is to openly develop that short list so we can accelerate this transformation.
Comparing the list of agency participation on data.gov, it appears that the number of raw data sets and tools posted by certain agencies has gone down at certain times. If an agency removes access to raw data or a tool on data.gov, it should be noted on the site, and a reason for the removal should be given.
In browsing the current Data.gov holdings, one sees data which has in some instances been chunked (for example, large datasets which would contain too many records as a single CSV); these may be broken out by state, et cetera. Similarly, there are datasets which are part of a time series (e.g. 2005->2006->2007 annual data releases). It would help usability to have ways to relate these ...more »
I applaud the visionaries behind Data.gov for understanding the importance of feedback. It is great that there are feedback input mechanisms for datasets and also this feedback input site (ideascale) for feedback on the solution governance, architecture, and technologies. But in an ideal world feedback is an ongoing conversation, with comments or suggestions going into the system, responses and requests for use cases ...more »
In light of the new Open Government Directive released by the White House on Dec. 8, 2009, can the Paperwork Reduction Act be updated (or at least an emergency exception be made) to allow government agencies to request feedback from the public, so they can make better decisions about which data sets to make available?
A citizen should be able to request a description of how a data element is derived. For example, the very controversial "saved or created jobs" data element should have an explanation of the formula or the derived fields used to calculate the number. Another example would be the all-important unemployment data: how is that percentage calculated? This would give the public more confidence in using the data. In ...more »
Toward more effective means of data discovery, particularly as more and more datasets are published and registered in data.gov, one thing that might provide value, particularly to, for example, the science community, might be integration and use of existing taxonomies, ontologies and thesauri as developed by various agencies. Models for integration exist, such as CUAHSI in the water community, or GBIF for biodiversity. ...more »
It is useful to show the file size, as many agencies do. This helps users (a) decide whether the file is too small to contain the data envisioned in the description (often the case) or (b) prepare for an especially large download.
Many of the goals of data.gov will only be realized when many govt data shops serve web services using common standards so they can be accessed by other web services and synthesizing portals. One example would be a common 'what data exists in your database in this ___ geographic area [lat long box, county, zipcode, LL/zip centroid plus distance]?' web service standard. Work with teams in the major gov data centers to develop standards ...more »
Hi, one of the greatest problems I find right now is data quality. First, there is no way to validate the data. Second, some data (for example dataset 401 about budgets) contains negative values (check Dept. of Defense). This is not only unhelpful, it also makes people and developers distrust data.gov as a source of valid information and data.
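To make the validation point concrete, here is a minimal sketch of the kind of check a user could run on a downloaded CSV. The function name, the column name `amount`, and the sample rows are all hypothetical and are not taken from dataset 401. Whether a negative value is really an error depends on how the publishing agency defines the field, so this only flags rows for review.

```python
import csv
import io

def find_suspect_values(csv_text, numeric_columns):
    """Return (row_number, column, raw_value, reason) for entries worth reviewing.

    Hypothetical helper: flags values that are negative or not numeric.
    """
    issues = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for column in numeric_columns:
            raw = (row.get(column) or "").strip()
            try:
                value = float(raw)
            except ValueError:
                issues.append((row_number, column, raw, "not numeric"))
                continue
            if value < 0:
                issues.append((row_number, column, raw, "negative"))
    return issues

# Made-up sample data, for illustration only.
sample = "agency,amount\nAgency A,100\nAgency B,-5\nAgency C,n/a\n"
for issue in find_suspect_values(sample, ["amount"]):
    print(issue)
```

A report like this could accompany a feedback submission, so the agency can confirm whether the flagged values are intended.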
What does the community feel are the top 5 metadata elements that should be mandatory for a dataset catalog record, AND of those 5, which do you think is most important to developers and citizens wanting to discover and use datasets that are relevant to them? Here is a recommendation: 1. Subject Coverage (aka topic or "WHAT", for a hierarchical tree display/browsing) 2. Schema Location (for structured datasets) 3. Geospatial ...more »
We need the disclosure of the names of nursing homes and rehabilitation centers that have low ratings, high incidences of abuse or neglect, or high rates of death. We, the public, are unaware of the ratings of these institutions when we place our loved ones there. They are recommended to us by hospitals upon discharge of elderly relatives, so we naturally believe they are safe. In 1999, my 76-year-old mother died ...more »
The open government directive has specific requirements like the following: Within 45 days, each agency shall identify and publish online in an open format at least three high-value data sets ... and register those data sets via data.gov... These must be data sets not previously available online or in a downloadable format. Tracking those specific requirements on data.gov (separate from the open government dashboard ...more »
On the main data.gov website, there is the Agency Participation tab. It does what you expect. It shows agencies and related statistics. What it doesn't do is simplify search. If you present this information, the statistics entice you to click on the participant to see what and why. If you can generate the numbers, you can populate a hyperlinked search to the datasets, tools, and more. Having to go back and then filter ...more »
Let's say you publish the number of cases cleared each month in 2009 but later learn that the numbers for some months were understated due to a server problem. If someone has already downloaded and is using the defective dataset, how are they notified that they need to download the dataset once more in order to have accurate data?
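One possible way to detect a revised dataset, sketched below, is to compare a checksum of the local copy against a checksum published with the catalog record. This assumes the catalog would publish such a checksum, which is not something Data.gov is known to do from this post. The function names and sample bytes are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()

def needs_redownload(local_bytes: bytes, published_checksum: str) -> bool:
    """True when the local copy no longer matches the published checksum."""
    return sha256_of(local_bytes) != published_checksum.strip().lower()

# Made-up example: the agency corrects an understated monthly figure.
original = b"month,cases_cleared\n2009-01,120\n"
corrected = b"month,cases_cleared\n2009-01,145\n"

print(needs_redownload(original, sha256_of(corrected)))
print(needs_redownload(original, sha256_of(original)))
```

A checksum only tells a user that something changed. A change log or notification feed would still be needed to explain what was corrected and why.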
It would be great if users could see all the Data Sets that have been suggested. This could cut down on duplicate data set suggestions.
The federal government is an information-intensive organization and it is imperative that data within and across the federal government be well managed. A central repository of data about data (also referred to as metadata) can be an effective data-management tool. However, a repository, as proposed here, is not to be confused with a data dictionary that merely gives definitions of data. A repository can be used to manage ...more »
Many developers would like to mine the data at a large scale, which may include a majority or all of the available data. It is technically possible to make a web scraper or mirror the site, but this would be a long, slow process that would consume a great deal of bandwidth both for the client and for the servers housing the data. An optimal way to pull down all the data would be to distribute it via BitTorrent. ...more »
Government institutions may try to briefly surface their data on Data.gov to earn points on the "open data dashboard", only to later take them down. I believe this is disingenuous, and clearly goes against the spirit of the Open Government Directive. The case I'm specifically referring to is the data tracking Broadband Stimulus funding via the BIP/BTOP programs. The data available at the following link has not been ...more »
I suppose this isn't a terribly common problem, but not all of us grew up with this technology, or have been able to afford or access it for long, whether because our place of residence had no access to the web or because the costs of equipment were too high for limited budgets. I can't speak for all, but sometimes the maze can be overwhelmingly confusing, trying to find what it is you're looking for, and then understand what it ...more »
Just need a lot of basic features for ease of finding info, such as filter, sort, search, for many data sources, including within/without sites and pages etc...
I believe you need a page that shows users how to use the site and how to perform searches. I consider myself pretty tech-savvy, but I could not easily determine what to do to get the data.
My colleague and I have made examples of how we would like to see and interact with tabular data online. Check out our initial provisional versions of searchable catalogs of data from the Bureau of Economic Analysis (BEA) and the Bureau of Justice Statistics (BJS): BEA: www.entabular.com/bea BJS: www.entabular.com/bjs The sites have a fair amount of data, high-quality metadata, and a pretty good search feature, ...more »
Looking for a site where federal agencies can post/look for new/existing/similar projects and/or federal partners to work with on new efforts.