Federated Search Convention

From Earth Science Information Partners (ESIP)
Revision as of 13:35, September 28, 2012 by 76.72.166.150 (talk) (Reverted edits by 89.99.5.17 (talk) to last revision by 108.59.8.70)

Motivation

The ESIP Federated Search convention is designed to provide a lightweight standard protocol for supporting dataset and file-level searches throughout the ESIP federation.

Version

This is Version 1.0 of the ESIP Federated Search.

Overall Architecture

Reuse of Existing Standards

The Federated Search convention makes as much use as possible of the OpenSearch conventions documented at http://www.opensearch.org. This includes the draft extensions for geospatial queries and for temporal queries.

The convention is also based on the Atom standard for responses.

Amendment of Standards

The reused standards above are amended in two ways: by restriction convention and by extension. Restriction conventions constrain syntax or semantics beyond what the standard allows. An example is the restriction that Federated Search responses use Atom, even though the OpenSearch standard admits multiple response formats. Atom is (mostly) intelligible to browser-based newsreaders (the lowest common denominator) while providing a relatively rich structure for parsing and accommodating domain-specific extensions. Extensions add elements that do not exist in the standard, following the standard's own methods for extension to the maximum extent possible.

Namespace

The namespace for Federated Search extensions is: http://esipfed.org/ns/fedsearch/1.0/. Note that this URI is currently used for namespace definition only and does not return a valid document.

Time in Atom Response

In the draft Time extension to OpenSearch, time is specified only for the query, not for the response. The ESIP Federated Search convention represents the time of datasets or granules in the response as follows:

  • The namespace is defined as xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/"
  • Time is represented as XML elements "start" and "stop" (following the draft for the Query), e.g.:
    • <time:start>YYYY-MM-DDTHH:MM:SSZ</time:start>
    • <time:stop>YYYY-MM-DDTHH:MM:SSZ</time:stop>
  • By convention, time is in Universal (Zulu) time, using the format YYYY-MM-DDTHH:MM:SS[.SSS]Z. Fractional seconds are optional.
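
As an illustration of the time convention above, the following is a minimal Python sketch, not part of the convention itself. It formats timestamps as YYYY-MM-DDTHH:MM:SS[.SSS]Z and wraps them in time:start and time:stop elements. The function names to_zulu and time_elements are hypothetical, and the sketch assumes the caller supplies timezone-aware datetime values.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace of the draft OpenSearch Time extension, as given in the convention.
TIME_NS = "http://a9.com/-/opensearch/extensions/time/1.0/"


def to_zulu(dt, fractional=False):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SS[.SSS]Z (UTC)."""
    utc = dt.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    if fractional:
        # Fractional seconds are optional; emit milliseconds when requested.
        text += ".%03d" % (utc.microsecond // 1000)
    return text + "Z"


def time_elements(start, stop):
    """Build <time:start> and <time:stop> elements (hypothetical helper)."""
    ET.register_namespace("time", TIME_NS)
    start_el = ET.Element("{%s}start" % TIME_NS)
    start_el.text = to_zulu(start)
    stop_el = ET.Element("{%s}stop" % TIME_NS)
    stop_el.text = to_zulu(stop)
    return start_el, stop_el


if __name__ == "__main__":
    begin = datetime(2012, 9, 28, 0, 0, 0, tzinfo=timezone.utc)
    end = datetime(2012, 9, 28, 23, 59, 59, tzinfo=timezone.utc)
    for element in time_elements(begin, end):
        print(ET.tostring(element, encoding="unicode"))
```

Converting to UTC before formatting keeps the trailing Z truthful even when the input carries another offset.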

Dataset-Level Queries and Responses

The Dataset-Level query may contain both Geo and Time extensions to OpenSearch, if the server supports them. The response to a Dataset-level query is an Atom document. This may include a number of links, but two links are required:

  1. the default link, which points to TBS
  2. the link to the OpenSearch Description Document for a granule-level query against the dataset. Following the OpenSearch convention for AutoDiscovery in RSS/Atom, the link must be identified by the rel and type attributes as:
rel="search" type="application/opensearchdescription+xml"
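
As a rough sketch of how a client might locate the second required link, the Python below scans a dataset-level Atom response for links carrying the rel and type values given above. The sample feed, its example.org URLs, and the function name granule_osdd_links are assumptions made for illustration; they are not taken from any ESIP server.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
OSDD_TYPE = "application/opensearchdescription+xml"

# Hypothetical dataset-level response; the URLs are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example dataset search results</title>
  <entry>
    <title>Example dataset</title>
    <link href="http://example.org/datasets/example"/>
    <link rel="search" type="application/opensearchdescription+xml"
          href="http://example.org/datasets/example/osdd.xml"/>
  </entry>
</feed>"""


def granule_osdd_links(atom_text):
    """Return the href of every granule-level OSDD link found in the entries."""
    root = ET.fromstring(atom_text)
    hrefs = []
    for entry in root.iter("{%s}entry" % ATOM_NS):
        for link in entry.findall("{%s}link" % ATOM_NS):
            # Both attributes must match for OpenSearch autodiscovery.
            if link.get("rel") == "search" and link.get("type") == OSDD_TYPE:
                hrefs.append(link.get("href"))
    return hrefs


if __name__ == "__main__":
    for href in granule_osdd_links(SAMPLE_FEED):
        print(href)
```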

Granule-level Queries and Responses

Granule-level Links

Atom allows inclusion of multiple links for a given entry, a key advantage for the rich variety of manifestations for a given data granule, such as:

  • data
  • browse image
  • metadata file
  • OPeNDAP URL
  • On-the-fly format conversion

However, it is important for a client to be able to distinguish the different kinds of links, which can be done using the "rel" attribute. As the standard set of "rel" attributes (self, related, alternate, enclosure, via) is insufficiently rich, we extend the standard using URIs within the ESIP namespace. The convention for this namespace is still under discussion, with the current proposal being:

etc. This will eventually be linked to the servicecasting namespace, so that services advertised through that mechanism can be referenced as types as well.
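
To illustrate how a client could tell granule links apart by their "rel" attribute, here is a minimal Python sketch. Because the rel URIs are still under discussion, the URI ending in example-data below is purely hypothetical and stands in for whatever value is eventually agreed; the sample entry, its example.org URLs, and the function name links_by_rel are likewise assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical rel URI, invented for this sketch only; not an agreed value.
EXAMPLE_DATA_REL = "http://esipfed.org/ns/fedsearch/1.0/example-data"

# Hypothetical granule-level entry with placeholder URLs.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example granule</title>
  <link href="http://example.org/granules/g1.html"/>
  <link rel="enclosure" href="http://example.org/granules/g1.hdf"/>
  <link rel="http://esipfed.org/ns/fedsearch/1.0/example-data"
        href="http://example.org/granules/g1.nc"/>
</entry>"""


def links_by_rel(entry_text):
    """Group the link hrefs of one Atom entry by their rel value."""
    entry = ET.fromstring(entry_text)
    grouped = defaultdict(list)
    for link in entry.findall("{%s}link" % ATOM_NS):
        # Atom interprets a link without a rel attribute as rel="alternate".
        rel = link.get("rel", "alternate")
        grouped[rel].append(link.get("href"))
    return dict(grouped)


if __name__ == "__main__":
    for rel, hrefs in links_by_rel(SAMPLE_ENTRY).items():
        print(rel, hrefs)
```

Keying on the full rel string means new link types can be added to the namespace later without changing the client's parsing logic.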

Implementation Guidelines

Appendices

Sample Server Implementation

Sample Client Implementation