Strategic Intent

Data Quality - Need process for assuring 'good data' on

On page 9 of the CONOP, the example of Forbes' use of Federal data to develop the list of "America's Safest Cities" brings to light a significant risk associated with providing 'raw data' for public consumption. As you are aware, much of the crime data used for that survey is drawn from the FBI's Uniform Crime Reporting program. As self-reported on the "Crime in the United States" website, "Figures used in this Report are submitted voluntarily by law enforcement agencies throughout the country. Individuals using these tabulations are cautioned against drawing conclusions by making direct comparisons between cities. Comparisons lead to simplistic and/or incomplete analyses that often create misleading perceptions adversely affecting communities and their residents."

Because seeks to make raw data available to a broad set of potential users, how will address the issue of data quality within the feeds provided through Currently, federal agency Annual Performance Reports required under the Government Performance and Results Act (GPRA) of 1993 require some assurance of the accuracy of the data reported; will there be a similar process for federal agency data made accessible through If not, what measures will be put in place to ensure that conclusions drawn from the data sources reflect the risks associated with 'raw' data? And how will we know that the data made available through is accurate and up to date?


Stage: Active

Feedback Score

23 votes


  1. Comment

    In response to Chuck about data quality: there is an element in the metadata (data about data) that indicates quality level because, to your point, people want to know whether it's good data or bad data. I also think there is a set of criteria that speaks to 'quality', so hopefully each dataset is using the same measuring stick. However, if this metadata is not complete, then we're back to your point. I'm wondering if it may make sense to make this metadata 'required' if it's not already.

  2. Comment
    Chuck Georgo ( Idea Submitter )

    Ellena, I see the potential for at least three types of data errors:

    - errors of commission - where a lack of rigor in the collection process may have allowed skew or bias. In the crime example, agencies may have different ways of classifying certain crimes.

    - errors of omission - where the data set is incomplete or doesn't sufficiently represent the metric measured. In the crime example, agencies may not report some (or any) data for some crimes.

    - errors of analysis - where the federal agencies release statistics based on data affected by one of the other two types of errors.

    The best option for some agencies may be to include a 'confidence' data element that gives consumers some idea of how comfortable the federal agencies are with the quality of the data they are sharing.


  3. Comment

    A method of expressing business rules should be available for whatever format is used to express the data. The business rules format should also be a global standard, just like whatever format for the information decides to use.

  4. Comment
    Chuck Georgo ( Idea Submitter )

    NCJA just published this article on crime data. In the spirit of openness, I thought I'd share ;-) ...

    Deficiencies In Old Crime Data-Collecting Methods Still Seen Today

    Problems that existed in crime reports more than 40 years ago still exist today, according to an article in the Wall Street Journal.

    According to the Journal, President Lyndon B. Johnson’s Commission on Law Enforcement and Administration of Justice published a report in 1967 that claimed 52 percent of American men would be arrested in their lifetime. Flaws that were admitted in that report are still present in many of today’s reports.

    To continue reading -->

  5. Comment

    I think the #1 priority in regard to quality is contextualizing the data presented here. That means placing links on the data set's page to the webpages of the originating studies and reporting, placing links to federal pages discussing/introducing the data, and placing links to retrospective pages which analyze the data set's limitations in a comparison of available data sets on the subject.
