As more and more datasets are published and registered in data.gov, data discovery could be made more effective, particularly for the science community, by integrating and using the existing taxonomies, ontologies and thesauri developed by various agencies.
Models for integration exist, such as CUAHSI in the water community, or GBIF for biodiversity. GBIF, for example, offers web services for taxonomic data lookup. One can enter "branta canadensis" and retrieve common names and variants (English: Canada Goose; French: Bernache Du Canada; Spanish: Ganso Canadiense) along with the taxonomy (Kingdom: Animalia; Phylum: Chordata; Class: Aves; Order: Anseriformes; Family: Anatidae; Genus: Branta; Species: Branta canadensis).
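A rough sketch of what such a lookup might look like from a client's point of view follows. It assumes GBIF's present-day public REST endpoint for name matching and the rank field names that service returns, which may differ from the service available when this was written; the function names fetch_match and extract_lineage are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: GBIF's public species-match endpoint.
GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"

# Linnaean ranks, ordered from broadest to narrowest.
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


def fetch_match(name: str, timeout: float = 10.0) -> dict:
    """Query the match service for a scientific name and return the parsed JSON."""
    url = GBIF_MATCH_URL + "?" + urllib.parse.urlencode({"name": name})
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def extract_lineage(match: dict) -> list[tuple[str, str]]:
    """Return (rank, name) pairs, broadest first, skipping ranks absent from the response."""
    return [(rank, match[rank]) for rank in RANKS if match.get(rank)]
```

Calling extract_lineage(fetch_match("Branta canadensis")) would be expected to give the kingdom-to-species lineage quoted above, provided the service responds with those fields. Keeping the parsing separate from the network call lets the same function work on cached or mirrored responses.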
A lookup like this can be integrated into a search box via AJAX to facilitate more robust searches, or to dynamically broaden searches and navigate to greater or lesser granularity, such as moving up a few levels to look at Anatidae. It can also be plugged into the tools used to tag data and author metadata, so that metadata records are robust and consistent, with useful degrees of thematic granularity and with consistent thesauri and terminology within the subject matter domains of interest.
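As a minimal sketch of the broadening step, assuming the search tool already holds the lineage for the term the user entered, moving up a few levels is a simple index shift. The LINEAGE data is the Canada Goose example from above, and the broaden function is a hypothetical helper, not part of any existing service.

```python
# Lineage for the Canada Goose example, broadest rank first.
LINEAGE = [
    ("kingdom", "Animalia"),
    ("phylum", "Chordata"),
    ("class", "Aves"),
    ("order", "Anseriformes"),
    ("family", "Anatidae"),
    ("genus", "Branta"),
    ("species", "Branta canadensis"),
]


def broaden(lineage, current, levels=1):
    """Return the (rank, name) pair `levels` steps above `current`.

    Stops at the broadest rank rather than running off the top.
    Raises ValueError if `current` is not a name in the lineage.
    """
    names = [name for _, name in lineage]
    index = names.index(current)
    return lineage[max(index - levels, 0)]


# Two levels up from the species is the family.
print(broaden(LINEAGE, "Branta canadensis", levels=2))  # ('family', 'Anatidae')
```

Narrowing a search works the other way around but needs a further lookup for the child taxa, so it is left out of this sketch.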