Application Programming Interfaces

We at Data.gov are very interested in offering datasets through API interfaces that allow applications to use the data dynamically without downloading the entire dataset. What approaches do developers recommend most, and can you provide examples of use cases and/or existing apps that support this need?


Submitted by Community Member 4 years ago


Comments (8)

  1. Mashup-oriented approaches with data services are an excellent way to provide timely data to a wide variety of stakeholders; for example, state agencies could dynamically integrate their data with Federal data.

    The alternative, providing only downloadable data, leaves a lot to chance: the Federal stewards then have little insight into how the data is being reused, whether it is being kept up to date, and so on.

    There are a number of existing standards, APIs, and integration approaches that could be offered in the Federal sector. In the geospatial community, for example, Open Geospatial Consortium (OGC) standards such as Web Map Service (WMS) and Web Feature Service (WFS) provide rendered map images, vector geometry, and geographic feature query results. KML can be provided either as a download or as a dynamic network link, where only the area of interest is served. Further optimization comes from serving either vector geometry (points and polygons) or image overlays: if someone wanted to view the National Hydrography Dataset in Google Earth, streaming all of its vectors in full fidelity would take an incredibly long time, whereas one could instead return a rendered image (e.g., a WMS tile embedded in KML) or generalized vector geometry (omitting streams of smaller stream order).
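
    As a rough sketch of how a client might consume such a service (the endpoint URL and layer name below are placeholders, not a real Federal service):

    ```python
    import urllib.parse
    import urllib.request

    # Hypothetical WMS endpoint and layer name -- substitute a real service.
    WMS_ENDPOINT = "https://example.gov/wms"

    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": "nhd:flowlines",          # placeholder layer name
        "CRS": "EPSG:4326",
        "BBOX": "45.0,-123.0,46.0,-122.0",  # only the area of interest
        "WIDTH": "512",
        "HEIGHT": "512",
        "FORMAT": "image/png",
    }

    # The server renders just the requested extent -- no bulk download needed.
    url = WMS_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        png_bytes = resp.read()
    ```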

    Additionally, static data services can provide robust benefits to a wide variety of users and agencies, such as pre-rendered, cached map tiles suitable for use in web mapping tools like Bing Maps, Google Maps, OpenLayers, and so on.
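
    For reference, most of those tools address cached tiles with a standard z/x/y scheme; a small sketch of the usual slippy-map tile math (the tile server URL is a placeholder):

    ```python
    import math

    def tile_for(lat, lon, zoom):
        """Standard slippy-map tile indices for a lat/lon at a zoom level."""
        n = 2 ** zoom
        x = int((lon + 180.0) / 360.0 * n)
        y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
        return x, y

    x, y = tile_for(38.9, -77.0, zoom=10)
    url = f"https://tiles.example.gov/basemap/10/{x}/{y}.png"  # placeholder server
    ```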

    The science community also uses services such as OPeNDAP for serving data in formats like NetCDF and HDF; these could similarly be registered in Data.gov to facilitate collaboration, modeling, and other work across the science community.
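
    One of OPeNDAP's main draws is server-side subsetting. A sketch using the netCDF4 Python library (the URL and variable name are hypothetical, and the library must be built with DAP support):

    ```python
    from netCDF4 import Dataset  # pip install netCDF4

    # Hypothetical OPeNDAP URL -- substitute a real THREDDS/Hyrax endpoint.
    url = "https://example.gov/opendap/sea_surface_temp.nc"

    ds = Dataset(url)                       # opens lazily over the DAP protocol
    sst = ds.variables["sst"][0, :10, :10]  # only this slice crosses the wire
    ds.close()
    ```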

    There is also the emergent use of REST APIs, where consistent URI schemes point to data assets at each degree of granularity, and of JSON, which provides an extremely web-friendly way of delivering data to applications.
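
    To make the URI-scheme idea concrete (the paths below are hypothetical, since no such scheme is defined today):

    ```python
    import json
    import urllib.request

    # Hypothetical URI scheme: each degree of granularity is its own resource.
    #   /datasets                     -> catalog listing
    #   /datasets/{id}                -> one dataset's metadata
    #   /datasets/{id}/records/{rec}  -> a single record
    BASE = "https://example.gov/api"

    with urllib.request.urlopen(f"{BASE}/datasets/water-quality/records/7") as resp:
        record = json.load(resp)  # JSON maps directly onto native structures
    ```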

    There are quite a few ways in which data can be delivered and integrated via such APIs and approaches.

    4 years ago
  2. To build further on your suggestion, and on some of the other suggestions regarding wikis et cetera, it might be good to also work toward a framework for sharing source code, SDKs, code samples, documentation, and other pieces - I have posted a few other comments and suggestions along this theme...

    4 years ago
  3. We recently provided a suite of web services (REST and SOAP) for the data managed within our web application/database for fish and wildlife restoration/mitigation efforts in the Pacific Northwest under the Dept. of Energy (more specifically under Bonneville Power Administration). These web services support optional parameters that allow the requestor to narrow their request.

    For example, instead of requesting the entire list of 1,300 projects in the Columbia River Basin, the requestor can filter the list by Province or Subbasin, or by project stage or purpose, etc.

    Our web services, documentation, and fledgling data dictionary are available at:

    http://www.cbfish.org/Report.mvc/Index

    (scroll to bottom of list)

    Note that we require requestors to obtain and then pass a token when making web service calls. Tokens are free; you just need an account in the system, which is also free. Tokens let us know who is using the services and enable us to notify consumers of API changes and of new web services as they become available.
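
    A minimal client sketch, assuming hypothetical endpoint paths, parameter names, and token handling (consult the documentation at cbfish.org for the real ones):

    ```python
    import json
    import urllib.parse
    import urllib.request

    # Hypothetical path, filter, and token parameter -- see the cbfish.org docs.
    BASE = "https://www.cbfish.org/api"
    params = {
        "token": "YOUR_TOKEN_HERE",  # issued free with an account
        "subbasin": "Willamette",    # narrow the request rather than pull all projects
    }

    url = f"{BASE}/projects?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        projects = json.load(resp)
    ```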

    4 years ago
  4. Great discussion emerging here! The City of New York recently released over 5 million property records from three different agencies (see www.nyc.gov/data)... Original formats were XLS, MDB, and TXT. We made the data available via a set of REST-style search APIs that can be accessed from most programming languages, such as JavaScript, PHP, Ruby, .NET, etc. The data is returned as XML, JSON, or CSV.
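
    As an illustration of that kind of multi-format search API (the endpoint and parameters are hypothetical, not NYC's actual paths):

    ```python
    import urllib.parse
    import urllib.request

    # Hypothetical search endpoint -- the real APIs are linked from www.nyc.gov/data.
    BASE = "https://example.gov/api/properties/search"

    for fmt in ("json", "xml", "csv"):
        query = urllib.parse.urlencode({"borough": "Brooklyn", "format": fmt})
        with urllib.request.urlopen(BASE + "?" + query) as resp:
            body = resp.read()  # same records, serialized per the format parameter
    ```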

    We did this using the BlankSlate platform (see www.blankslate.com), which takes data as an input, either through manual upload or programmatically, and instantly returns APIs. We also built an embeddable widget that uses the APIs.

    See www.blocksandlots.com for details.

    Thanks!

    4 years ago
  5. Kael - Nice work on the site! Looks like you have over 100 data sets up there. The glossary is helpful; I'm curious whether you'll be asked to provide more of a full dictionary for some of the structured data in your data sets. For example, your Nature Preserves data set has a "HabitatType" attribute that caught my eye (we have the same attribute and are working with orgs like NOAA, the Nature Conservancy, and the IUCN to square up their lists of habitat types with ours).

    We're grappling, as I suspect you are, with how much to invest in full data dictionaries, FGDC-compliant metadata documentation, and other aids (e.g., we're playing around with mind-map diagrams) to help users understand the data in our data sets. Our main question is how much to do proactively (anticipating the need) vs. waiting until we receive a handful of requests. With limited resources it's a tough decision, but we're leaning toward waiting until there's a clear need.

    Perhaps this is a question for a different topic... if the moderators want to move this to a different thread, please feel free.

    4 years ago
  6. Matt - Ah, I think I see which data set you are looking at and can respond, though I need to clarify one thing first.

    The APIs we developed were for the property records that NYC released (Buildings, Finance, City Planning). I think you are looking at a data set from the NYC Department of Parks and Recreation. But no matter, because the point you are making is a good one: how do you help users effectively use the APIs?

    Before addressing the need for a data dictionary (which we certainly will need), we believe that the first step in making data useful is making the data sets accessible. On that front we're taking the following next steps.

    Provide a developer portal (including a blog, wiki, and forum) that helps manage:

    * administrative things like API keys;

    * user access, like API call limits, and user support;

    * API usage, like how to structure requests and read responses, with sample code in different languages to shorten time to implementation; and

    * understanding of the data that is available.

    Once we've addressed data access, the next step is making the data meaningful. To that end, consistent documentation in the form of a data dictionary is a great approach. We plan on auto-generating base documentation by listing data attributes (field name, data type, content length, etc.) that can be machine-interpreted, and then providing a facility to manually refine the definitions through a community or controlled approach, such as a wiki or CMS.
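
    A sketch of that auto-generation step, assuming records arrive as a list of dicts (the field names and sample values are hypothetical):

    ```python
    # Derive base data-dictionary entries (field name, observed types, max length)
    # from a sample of records; humans then refine the definitions in a wiki/CMS.
    def base_dictionary(records):
        fields = {}
        for rec in records:
            for name, value in rec.items():
                entry = fields.setdefault(name, {"types": set(), "max_len": 0})
                entry["types"].add(type(value).__name__)
                entry["max_len"] = max(entry["max_len"], len(str(value)))
        return fields

    sample = [{"Borough": "Brooklyn", "LotArea": 2500},
              {"Borough": "Queens", "LotArea": 10375}]
    for name, info in base_dictionary(sample).items():
        print(name, sorted(info["types"]), info["max_len"])
    ```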

    We can post a link to the portal on this site.

    4 years ago
  7. Kael - Thanks, that makes sense for the most part. For us it's still a bit of a tough call whether to do the things you mention (creating developer community sites/blogs/wikis, writing docs/help topics for developers, providing sample code) vs. providing more end-user documentation on the actual data (for us, policy/management level and perhaps BI types) to help them decide if it's worth their time.

    Perhaps a chicken-or-egg question, though I'd think you'd want to start by creating docs/support for the users you expect to want your data, the ones with some need (aka a business driver). For us, I feel it's the policy and management folks who effectively run, and make funding decisions on, large programs. They'll no doubt then go down the hall and ask their developer/IT folks to access and make use of the data.

    Tangentially: as far as furnishing sample code that exercises your web services, we recently had the luck to stumble across the http://www.usgovxml.com website (not officially associated with data.gov). The guy behind it is Robert Loftin, and apparently a hobby of his is writing small web apps that tap into web service interfaces to gov't data. He's currently writing some sample code/apps to exercise the web services we've provided for DoE Fish & Wildlife data (cbfish.org).

    4 years ago
  8. RESTful services obviously have a place within the Data.gov API, but what might well be extremely useful is to define more precisely what constitutes a RESTful service, to establish mechanisms for discovering RESTful services, and to establish both URL template formats and consistent message behavior for REST (for instance, what is the established mechanism for learning the identifier of a recently POSTed resource?).
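
    On that last question, one widely used HTTP convention is for the server to answer the POST with 201 Created and a Location header naming the new resource. A client-side sketch against a hypothetical endpoint:

    ```python
    import json
    import urllib.request

    # Hypothetical endpoint -- illustrates the 201 Created / Location convention,
    # not an actual Data.gov API.
    req = urllib.request.Request(
        "https://example.gov/api/datasets",
        data=json.dumps({"title": "Water Quality 2010"}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        assert resp.status == 201
        new_uri = resp.headers["Location"]  # URI of the newly created resource
    ```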

    4 years ago