Getting Started:
Add Your Idea by clicking on the “new idea” button to the left.
While you're here:
- Review Our Plans: Take a look at our ideas to evolve Data.gov. You can download our draft Data.gov Concept of Operations in Word or Acrobat formats.
- Comment on Ideas: Jump in and let us know what you think about the ideas others have raised.
- Vote, Vote, Vote! Your votes are critical to ensuring that the best ideas "bubble up" to the top.
- Spread the Word! E-mail, tweet, or post this URL (http://datagov.ideascale.com/) to your networks, and invite them to get involved.
- Review our Terms of Participation and Privacy Policy.
Evolving Data.gov with You
We opened this discussion to encourage the community to share creative ideas and help us evolve Data.gov. We are getting some great ideas and discussion. We are working hard to incorporate many of your ideas, so stay tuned as Data.gov transforms throughout 2010.
If you are just visiting the site for the first time, we urge you to take a look at our Plan, the Draft Concept of Operations, read through the ideas already posted to the site, contribute new ideas and comments, and vote for your favorite ideas. Topic areas, including the Developer’s Corner, are on the left-hand toolbar, and you can dive right in with the Getting Started section, which appears on the right-hand toolbar. Let your voice be heard to make Data.gov a richer, more user-friendly resource for everyone!
Thank you very much for your contributions; they will be invaluable as we take the next steps to build future versions of Data.gov.
Linda A. Travers is CIO Council Data.gov Co-lead and Deputy CIO, US Environmental Protection Agency
Sanjeev "Sonny" Bhagowalia is CIO Council Data.gov Co-Lead and CIO, US Department of the Interior
|
|
|
I just downloaded some energy data from data.gov on nuclear reactors and one of the columns is:
"NRC Unit"
I have no idea what that means. Every column or field of data should have a definition, and that definition should be available on data.gov or in a standard format with the dataset. In this case, the data dictionary that the catalog record links to does not define this field.
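The check this idea asks for could be automated at submission time. The following is only a minimal sketch, assuming the dataset is a CSV file and the data dictionary is a simple mapping from column name to definition. The function name, the sample header, and the sample definitions are hypothetical; only the "NRC Unit" column name comes from the post above.

```python
import csv
import io


def undefined_columns(csv_text, data_dictionary):
    """Return the CSV header columns that have no non-empty definition."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [
        name for name in header
        if not data_dictionary.get(name.strip(), "").strip()
    ]


# Hypothetical sample data; the definitions are placeholders, not real ones.
sample_csv = "Plant Name,NRC Unit,Capacity\nExample Plant,1,1000\n"
sample_dictionary = {
    "Plant Name": "Name of the facility (placeholder definition).",
    "Capacity": "Rated capacity (placeholder definition).",
}

missing = undefined_columns(sample_csv, sample_dictionary)
print(missing)
```

A catalog could refuse to publish, or at least flag, any dataset for which this list is not empty.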
|
|
|
Data.gov should become a nationally managed "access point" that provides a mechanism for all levels of government to participate or integrate with, thus creating a single location for citizens to access government data.
|
|
|
An API to provide customizable RSS feeds should be considered, to allow users to subscribe to specific thematic areas, geographic areas and so on. This way, if for example someone was interested in data on aquatic resources in the Chesapeake Bay, they could subscribe and apply filters by custom search terms, by geography of interest, thematic keywords and so on.
This could either provide notices via e-mail, or by RSS, et cetera. As an example, Google allows users to do a custom search (e.g. Google News search or Google Blog Search with a user-specified query) and then the search results page displays an RSS feed icon, which users can then load into their RSS feed reader, to get regular updates matching those search parameters.
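As a rough illustration of the filtered-feed idea, the sketch below builds an RSS 2.0 document from catalog records that match a set of keywords and a region. The record fields, the example.org URLs, and the function names are all assumptions made for the example; nothing here reflects how Data.gov actually stores or serves its catalog.

```python
import xml.etree.ElementTree as ET


def matches(record, keywords=(), region=None):
    """True if every keyword appears in the record text and the region fits."""
    text = (record["title"] + " " + record["description"]).lower()
    if not all(k.lower() in text for k in keywords):
        return False
    return region is None or record.get("region") == region


def build_feed(records, feed_title, keywords=(), region=None):
    """Return an RSS 2.0 document (as a string) for the matching records."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = "http://example.org/catalog"
    ET.SubElement(channel, "description").text = "Datasets matching a saved search"
    for record in records:
        if matches(record, keywords, region):
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = record["title"]
            ET.SubElement(item, "link").text = record["link"]
            ET.SubElement(item, "description").text = record["description"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical catalog records.
catalog = [
    {"title": "Aquatic resources survey", "description": "Fish and shellfish counts.",
     "region": "Chesapeake Bay", "link": "http://example.org/datasets/1"},
    {"title": "Road conditions", "description": "Pavement ratings.",
     "region": "Chesapeake Bay", "link": "http://example.org/datasets/2"},
    {"title": "Aquatic habitat map", "description": "Wetland extents.",
     "region": "Puget Sound", "link": "http://example.org/datasets/3"},
]

feed = build_feed(catalog, "Chesapeake aquatic data",
                  keywords=["aquatic"], region="Chesapeake Bay")
print(feed)
```

A subscriber would point a feed reader at a URL that encodes the same keywords and region, and the server would regenerate the feed as new datasets arrive.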
|
|
|
Most of the big APIs like Twitter, Facebook, and now LinkedIn have a dedicated core team that spends a lot of time evangelizing the API and working with developers to get what they need.
Data.gov needs a Data.gov evangelist who can be the community manager out on the road listening and talking and generally spreading the word on data.gov and its value.
I think this is a separate role from leadership like Vivek Kundra and Aneesh Chopra (who have a ton of priorities, so their bandwidth is limited).
|
|
|
The system should have a way to add "notes" about the data. This could include specific things that should be watched for (for example, the assumptions that the data was collected under, or the methodology used to collect it)...it's not much use to use the dataset "tax data for 1999" in a mission-critical application if the methodology was to "ask 10 people what the tax rate was in 1999".
Likewise, there should be ways for the public to add notes about possible problems they have found with the data and a feedback mechanism to ensure that others know about these possible problems...for example "21 Main St." no longer exists and should be removed from the "current address" dataset.
|
|
|
As the Dublin Core proved - a little bit of standardization can go a long way. In the open gov data arena, a short list of the areas it would make sense to have global standards on would help all governments and citizens get clear on what needs to be developed for voluntary adoption and where we develop the standards. This idea is to openly develop that short list so we can accelerate this transformation.
|
|
|
We at DataGov are very interested in enabling datasets with API interfaces that will allow applications to dynamically use the data without downloading the entire dataset. What approaches do developers most often recommend, and can you provide examples of use cases and/or existing apps that support this need?
|
|
|
Federal agencies are already trying their best to respond to a stream of unfunded mandates. Requiring federal agencies to a) expose their raw data as a service and b) collect, analyze, and respond to public comments requires resources. The requirement to make data accessible to (through) Data.gov should be formally established as a component of one of the Federal strategic planning and performance management frameworks (GPRA, OMB PART, PMA) and each agency should be funded (resourced) to help ensure agency commitment towards the Data.gov effort. Without direct linkage to a planning framework and allocation of dedicated resources, success of Data.gov will vary considerably across the federal government.
|
|
|
The current Data.gov search tool is capable but cumbersome and does not yield desired results. We should combine and leverage capabilities in other government websites that have good search engines (and/or are planning on more improvements such as usa.gov).
We should leverage this and make search available simply at the top of the site for all users. We should also have the ability to invoke more complex search capabilities as required.
|
|
|
In browsing the current Data.gov holdings, one sees data which has in some instances been chunked (for example large datasets which would contain too many records as CSV) - perhaps these may be broken out by state, et cetera. Similarly, there are cases where there are datasets which are part of a time series (e.g. 2005->2006->2007 annual data releases) - it would be useful toward usability to have ways to relate these and treat them as groups, e.g. treating an entire collection of individual state files as a unit, or being able to navigate (for example, if I am looking at the 2005 Tennessee dataset, I might want to be able to quickly jump to the 2006 Tennessee dataset).
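One way to picture this grouping is an index keyed by series and state, with the release year as a second level. The sketch below is an assumption-laden illustration: the record fields (`series`, `state`, `year`, `id`), the "example-survey" series, and the function names are all hypothetical.

```python
def build_index(datasets):
    """Group catalog records by (series, state), keyed by release year."""
    index = {}
    for record in datasets:
        key = (record["series"], record["state"])
        index.setdefault(key, {})[record["year"]] = record["id"]
    return index


def next_release(index, series, state, year):
    """Return the id of the next later release in the same group, or None."""
    releases = index.get((series, state), {})
    later = sorted(y for y in releases if y > year)
    return releases[later[0]] if later else None


def collection(index, series, year):
    """Return every state file for one series and year, treated as a unit."""
    return sorted(
        releases[year]
        for (name, _state), releases in index.items()
        if name == series and year in releases
    )


# Hypothetical catalog: one series, two states, three annual releases.
catalog = [
    {"id": f"example-survey-{state}-{year}", "series": "example-survey",
     "state": state, "year": year}
    for state in ("TN", "VA")
    for year in (2005, 2006, 2007)
]

index = build_index(catalog)
print(next_release(index, "example-survey", "TN", 2005))
print(collection(index, "example-survey", 2005))
```

With an index like this, a dataset page for the 2005 Tennessee file could link straight to the 2006 Tennessee file, or to the full set of 2005 state files.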
|
|
|
Perhaps it's there and I don't see it, but it would seem that some kind of automated versioning system needs to be in place. My concern, from a transparency standpoint, would be if there is the capability to overwrite an existing dataset. Even if it is well intentioned, if the original dataset is not preserved, questions could arise as to what has been changed/added/removed.
This could also help with datasets that change over time. Rather than having numerous "Year 200x" versions, an automated version system could handle this much more cleanly.
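A minimal sketch of what such versioning might look like, assuming an append-only store in which every published revision is kept and identified by a content hash. The class and method names are hypothetical, and a real system would also need persistent storage and access control.

```python
import difflib
import hashlib


class VersionedDataset:
    """Append-only history of a dataset; earlier versions are never overwritten."""

    def __init__(self):
        self._versions = []  # list of (sha256 hex digest, content bytes)

    def publish(self, content):
        """Store a new version and return the current version number (1-based)."""
        digest = hashlib.sha256(content).hexdigest()
        if self._versions and self._versions[-1][0] == digest:
            return len(self._versions)  # identical to latest: no new version
        self._versions.append((digest, content))
        return len(self._versions)

    def get(self, version):
        """Return the exact bytes of a given version."""
        return self._versions[version - 1][1]

    def changes(self, old, new):
        """List lines removed ('-') and added ('+') between two versions."""
        a = self.get(old).decode("utf-8").splitlines()
        b = self.get(new).decode("utf-8").splitlines()
        diff = list(difflib.unified_diff(a, b, lineterm=""))
        # Skip the two file-header lines, keep only added/removed lines.
        return [line for line in diff[2:] if line.startswith(("+", "-"))]


dataset = VersionedDataset()
v1 = dataset.publish(b"id,value\n1,10\n2,20\n")
v2 = dataset.publish(b"id,value\n1,10\n2,25\n")
print(dataset.changes(v1, v2))
```

Because the original bytes stay retrievable, anyone can answer the question of what was changed, added, or removed between two releases.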
|
|
|
Comparing the list of agency participation on data.gov, it appears that the number of raw data sets and tools posted by certain agencies has gone down at certain times. If an agency removes access to raw data or a tool on data.gov, it should be noted on the site, and a reason for the removal should be given.
|
|
|
Many high-profile projects that rely on a community of supporters, like mozilla.org and OpenOffice, post a public roadmap that details major forthcoming plans and milestones. OMB/GSA should do the same for data.gov so that the public can see when planned innovations will occur and that they are actually influencing the evolution.
|
|
|
Create a page that highlights applications that are making use of data from data.gov. The examples should be real-world sites (not demos) that currently make use of the data. They should range from simple to more complex. The public should be able to submit their sites to the list. Possibly, to encourage the use of the data, prizes should be given out for the "best" sites, chosen by the government or by public voting.
Contests to make use of the data would give incentives for the data to be used, possibly create more value, show the value of the system (which may also encourage larger buy-in from the different branches of government), and encourage local governments to provide more data.
|
|
|
Many professional and social circles continue to focus on issues such as climate change, governmental fiscal imbalances, the demographic shift to older populations, depleting resources, health care, and increasing technological complexity. We are about to embark on a new decade and these unresolved issues will follow us. How will Data.gov contribute toward addressing the tough questions we face in the next decade? One way to envision the contributions of Data.gov is to build a series of scenarios, maybe one for each major issue that, if left unresolved, could lead to a major disruption.
|
|
|
Though there is another idea regarding a wikipedia entry... I think it is important to have a separate wiki set up for developers, as there are many developer topics where collaboration is key. Of course, there is plenty of precedent for this: many major development efforts/sites have developer wikis.
|
|
|
In addition to posting datasets and web services, Agencies should also be posting code and documentation. As with a data policy of "By default, all data should be made available, unless there are compelling reasons why not, e.g. sensitivity", so too should be the case with GOTS code. A large number of parallel development efforts by contractors and agency staff alike are underway across many agencies, often replicating similar efforts. To realize economies of scale and foster collaboration, there should be a centralized Federal initiative to promote sharing and reuse of code and documentation across agencies, similar to a SourceForge approach. Agencies could maintain such code repositories, with a common API that allows a single point of entry to search across all agencies.
The intent would be to facilitate code sharing where developed using government funding, e.g. GOTS code developed under contract to an agency, code developed via grants and so on. Private sector individuals making their own investments in innovative code would not need to participate but could do so on a voluntary, opt-in basis - proprietary code and COTS software code that was not developed through federal funding would not need to be posted.
|
|
|
You should list the types of data requested by the public and the number of votes for each type.
|
|
|
I feel like this is an obvious one but I just tried downloading some data set that has .dbf, .shp, and .shx files. I don't recognize any of these and could not open any on my computer.
Let's make this useful by allowing the data provided to be accessed.
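For context, .shp, .shx, and .dbf are the core files of an ESRI shapefile: the .shp file holds the geometry, the .shx file is an index into it, and the .dbf file is a dBASE table of attributes. They are normally opened with GIS software. As a hedged illustration that the format can be inspected with general-purpose tools, the sketch below reads the fixed 100-byte header of a .shp file using only the Python standard library. The function name is hypothetical, and the header bytes in the example are synthetic, not taken from any Data.gov file.

```python
import struct

# Shape type codes for the most common geometry types.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def read_shp_header(header):
    """Parse the first 100 bytes of a .shp file into a small summary dict."""
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])       # big-endian
    if file_code != 9994:
        raise ValueError("not a .shp file (bad file code)")
    (length_words,) = struct.unpack(">i", header[24:28])  # length in 16-bit words
    version, shape_type = struct.unpack("<ii", header[28:36])  # little-endian
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, f"code {shape_type}"),
        "bounds": (xmin, ymin, xmax, ymax),
    }


# Synthetic header for demonstration: a polygon file with a made-up extent.
demo_header = (
    struct.pack(">i", 9994) + b"\x00" * 20 + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 5)
    + struct.pack("<8d", -90.0, 35.0, -81.0, 36.5, 0.0, 0.0, 0.0, 0.0)
)
print(read_shp_header(demo_header))
```

For a real file, the same function could be called on the first 100 bytes read with `open(path, "rb").read(100)`. Offering a plain CSV or KML alternative next to each shapefile would still serve non-GIS users better.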
|
|
|
While there are quite a few discussions on this site about search, I would like to see improved browsing via a robust taxonomy/folksonomy of topic areas. In fact, I would recommend a combined top-down/bottom-up approach where you begin with a top-down taxonomy but allow it to be extended via topic area suggestions and popular keywords.
As a citizen, I don't always know what I want but want to browse and see what is available.
|
|
|
I applaud the visionaries behind Data.gov for understanding the importance of feedback. It is great that there are feedback input mechanisms for datasets and also this feedback input site (ideascale) for feedback on the solution governance, architecture, and technologies.
But in an ideal world feedback is an ongoing conversation, with comments or suggestions going into the system, responses and requests for use cases or clarification coming back out, explanation going back in, etc. This ongoing conversation seems to be broken in Data.gov. From the outside we can make suggestions, we can vote on suggestions, we can comment on each other's suggestions. But it seems that we can never hear back from Data.gov. It is impossible to tell if our suggestions are heard and/or understood, much less if there is likely to be any action in response.
I understand that there may be roadblocks to conversations, but it is important that these roadblocks be addressed because otherwise I think the community will tire of sending comments into the black hole, and will decide that feedback to Data.gov is ineffective, and that there are no opportunities to help make Data.gov better. This would be a sad loss of potential.
|
|
|
In addition to providing Agency metadata and facilitating informed discovery of datasets, Data.gov could also serve as a clearinghouse for discovering and accessing data-centric agency web services, e.g. Open Geospatial Consortium Web Map Service (WMS) and Web Feature Service (WFS). In many instances, these types of services may lie scattered across and within agencies, with users accessing them via differing URLs, often with their own unique parameters, supported functionalities, versioning and quirks. This can be tremendously simplified by using Data.gov to aggregate these disparate services - an example of this can be found here: http://carboncloud.blogspot.com/2010/01/cubewerx-and-carbon-project-contribute.html - where CubeWerx and the Carbon Project developed a cascading OGC service to facilitate and simplify access to a variety of data services to support Haiti response and recovery.
Essentially, what this provides is a single point of entry, which provides a catalog of services collected dynamically from disparate OGC WMS and WFS servers, which then can all be accessed seamlessly from one point, as opposed to having to add multiple servers separately.
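To make the single-point-of-entry idea a little more concrete, the sketch below keeps a small registry of service endpoints and produces a normalized GetCapabilities request for each one. It does not fetch anything or implement cascading; it only shows how differing base URLs could be hidden behind one catalog. The registry entries and example.org endpoints are hypothetical; `SERVICE`, `REQUEST`, and `VERSION` are the standard OGC request parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def capabilities_url(endpoint, service, version):
    """Build a GetCapabilities URL, preserving any query the endpoint already has."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))
    query.update({"SERVICE": service, "REQUEST": "GetCapabilities", "VERSION": version})
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical registry of agency services with differing URL styles.
registry = [
    {"agency": "Agency A", "service": "WMS", "version": "1.3.0",
     "endpoint": "http://example.org/agency-a/wms"},
    {"agency": "Agency B", "service": "WFS", "version": "1.1.0",
     "endpoint": "http://example.org/cgi-bin/mapserv?map=agency_b.map"},
]

catalog = {
    (entry["agency"], entry["service"]): capabilities_url(
        entry["endpoint"], entry["service"], entry["version"]
    )
    for entry in registry
}

for key, url in catalog.items():
    print(key, url)
```

A cascading service would go one step further and merge the layer lists returned by these requests into a single capabilities document.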
|
|
|
On Page 9 of the CONOP, the example of Forbes' use of Federal data to develop the list of "America's Safest Cities" brings to light a significant risk associated with providing 'raw data' for public consumption. As you are aware, much of the crime data used for that survey is drawn from the Uniform Crime Reporting effort of the FBI. As self-reported on the "Crime in the United States" website, "Figures used in this Report are submitted voluntarily by law enforcement agencies throughout the country. Individuals using these tabulations are cautioned against drawing conclusions by making direct comparisons between cities. Comparisons lead to simplistic and/or incomplete analyses that often create misleading perceptions adversely affecting communities and their residents."
Because Data.gov seeks to make raw data available to a broad set of potential users, how will Data.gov address the issue of data quality within the feeds provided through Data.gov? Currently, federal agency Annual Performance Reports required under the Government Performance and Results Act (GPRA) of 1993 require some assurance of the accuracy of the data reported; will there be a similar process for federal agency data made accessible through Data.gov? If not, what measures will be put in place to ensure that conclusions drawn from the Data.gov data sources reflect the risks associated with 'raw' data? And how will we know that the data made available through Data.gov is accurate and up-to-date?
|
|
|
Geographic referencing adds critical context to data. It helps users quickly and easily determine whether a dataset pertains to their specific area of interest, and in the event that it does, empowers users by immediately allowing them to visualize that data, perhaps coupled with additional datasets for informing context. Both Geospatial One Stop and Data.gov are citizen-centric initiatives. Migrating and consolidating the two programs would both energize and maximize any place-based analytical capability the nation could leverage in the future.
|
|
|
Require agencies to submit datasets in standard format with common metadata fields, including short and long descriptions to improve user understanding of data-set.
Create a user interface that enables users to easily graph multiple time-series data sets (simple trend graphs). This will let them visually compare different data sets on relative scales.
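Comparing series "on relative scales" usually means rebasing each one to a common starting point. A minimal sketch, assuming each series is a mapping from year to value and that the base-year value is set to 100; the series names and numbers are made up for illustration.

```python
def rebase(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    if base == 0:
        raise ValueError("cannot rebase on a zero value")
    return {year: 100.0 * value / base for year, value in sorted(series.items())}


# Hypothetical series with very different magnitudes.
series_a = {2005: 50.0, 2006: 55.0, 2007: 60.0}
series_b = {2005: 200.0, 2006: 190.0, 2007: 210.0}

indexed = {
    "series_a": rebase(series_a, 2005),
    "series_b": rebase(series_b, 2005),
}
for name, values in indexed.items():
    print(name, values)
```

Once both series start at 100, they can share one axis on a simple trend graph, even though their raw values differ by a factor of four.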
By aggregating all government data into a standard format and enabling users to select and compare different data-sets, more in depth "super crunching" can be done, like multivariate regression analysis across selected years and data sets.
Other features: 1) allow users to easily export underlying selected data from any individual or series of dataset comparisons, 2) allow users to easily export the graphs/tables they create, and 3) ensure a consistent stamp that indicates the source(s) and when the data was extracted.
Data.gov is currently a hodge-podge of data sets and random "apps". I work the EPA and found the TRI data sets horribly unorganized and not very useful. The power users would go to the EPA website, and new users would find it difficult to make sense of them. I think it is imperative to make data.gov a database of standardized data sets instead of just a flea market for whatever agencies submit.
For many data sets, particularly the economic data, this standardization would not be hard to do. The relative burden on each agency is minimal compared to the benefits of being able to access all government data in a structured way from a common database.