Difference between revisions of "Federated Search Convention"

From Earth Science Information Partners (ESIP)
(Reverted edits by 89.99.5.17 (talk) to last revision by 108.59.8.70)
 
(18 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
= Motivation =
 
 +
The ESIP Federated Search convention is designed to provide a lightweight standard protocol for supporting dataset and file-level searches throughout the ESIP federation.
 +
= Version =
 +
This is Version 1.0 of the ESIP Federated Search.
  
 
= Overall Architecture=
 
  
 
= Reuse of Existing Standards =
 
 +
The Federated Search convention makes as much use as possible of the OpenSearch conventions documented at [http://www.opensearch.org http://www.opensearch.org]. This includes the draft extensions for [http://www.opensearch.org/Specifications/OpenSearch/Extensions/Geo/1.0/Draft_1 geospatial queries] and for [http://www.opensearch.org/Specifications/OpenSearch/Extensions/Time/1.0/Draft_1 temporal queries].
  
= Modification of Standards =
+
The convention is also based on the [http://www.atomenabled.org/developers/syndication/atom-format-spec.php Atom standard] for responses.
  
== Restriction Conventions ==
+
= Amendment of Standards =
=== Response Formats ===
+
The reused standards above are amended in two ways: by restriction conventions and by extensions. Restriction conventions constrain syntax or semantics beyond what the standard requires. For example, Federated Search responses must be Atom, even though the OpenSearch standard admits several response formats. Atom is (mostly) intelligible to browser-based newsreaders (the lowest common denominator) while providing a relatively rich structure for parsing and accommodating domain-specific extensions.
Although OpenSearch allows a number of different formats (HTML, RSS, Atom), the ESIP Federated Search convention is to return an Atom response.  This is (mostly) intelligible to browser-based newsreaders (the lowest common denominator) while providing a relatively rich structure for parsing as well as accommodating domain-specific  extensions.
+
Extensions add elements that the standard does not define, following the standard's own extension mechanisms to the maximum extent possible.
=== Time in Atom Response ===
+
== Namespace ==
The ESIP Federated Search convention is to represent Time in the Atom response as Universal (Zulu) time, using the format YYYY-MM-DDTHH:MM:SS[.SSS]Z. Fractional seconds are optional.
+
The namespace for Federated Search extensions is:
 +
'''http://esipfed.org/ns/fedsearch/1.0/'''.
 +
Note that this URI is currently used for namespace definition only and does not return a valid document.
  
== Extensions ==
+
== Time in Atom Response ==
 +
Time is specified only for the query, not the response, in the [http://www.opensearch.org/Specifications/OpenSearch/Extensions/Time/1.0/Draft_1 draft Time extension to OpenSearch]. The ESIP Federated Search convention is to represent the time of datasets or granules as follows:
 +
*The namespace is defined as xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/"
 +
*Time is represented as XML elements "start" and "stop" (following the draft for the Query), e.g.:
 +
**<time:start>YYYY-MM-DDTHH:MM:SSZ</time:start>
 +
**<time:stop>YYYY-MM-DDTHH:MM:SSZ</time:stop>
 +
*By convention, time is in Universal (Zulu) time, using the format YYYY-MM-DDTHH:MM:SS[.SSS]Z.  Fractional seconds are optional.
 +
== Dataset-Level Queries and Responses ==
 +
A dataset-level query may use both the Geo and Time extensions to OpenSearch, if the server supports them. The response to a dataset-level query is an Atom document. It may include any number of links, but two are required:
 +
# the default link, which points to '''TBS'''
 +
# the link to the OpenSearch Description Document for a granule-level query against the dataset.  Following the [http://www.opensearch.org/Specifications/OpenSearch/1.1#Autodiscovery_in_RSS.2FAtom OpenSearch convention for AutoDiscovery in RSS/Atom], the link must be identified by the ''rel'' and ''type'' attributes as:
 +
rel="search" type="application/opensearchdescription+xml"
 +
 
 +
== Granule-level Queries and Responses ==
 +
 
 +
=== Granule-level Links ===
 +
Atom allows inclusion of multiple links for a given entry, a key advantage for the rich variety of manifestations for a given data granule, such as:
 +
*data
 +
*browse image
 +
*metadata file
 +
*OPeNDAP URL
 +
*On-the-fly format conversion
 +
However, it is important for a client to be able to distinguish the different kinds of links, which can be done using the "rel" attribute. As the standard set of "rel" values (self, related, alternate, enclosure, via) is insufficiently rich, we extend the standard using URIs within the ESIP namespace. The convention for this namespace is still under discussion, with the current proposal being:
 +
*http://esipfed.org/ns/fedsearch/1.0/data#
 +
*http://esipfed.org/ns/fedsearch/1.0/browse#
 +
*http://esipfed.org/ns/fedsearch/1.0/metadata#
 +
*http://esipfed.org/ns/fedsearch/1.0/opendap#
 +
etc.
 +
This will eventually be linked to the servicecasting namespace, so that services advertised through that mechanism can be referenced as types as well.
  
 
= Implementation Guidelines =
 

Latest revision as of 13:35, September 28, 2012

Motivation

The ESIP Federated Search convention is designed to provide a lightweight standard protocol for supporting dataset and file-level searches throughout the ESIP federation.

Version

This is Version 1.0 of the ESIP Federated Search.

Overall Architecture

Reuse of Existing Standards

The Federated Search convention makes as much use as possible of the OpenSearch conventions documented at http://www.opensearch.org. This includes the draft extensions for geospatial queries and for temporal queries.

The convention is also based on the Atom standard for responses.

Amendment of Standards

The reused standards above are amended in two ways: by restriction conventions and by extensions. Restriction conventions constrain syntax or semantics beyond what the standard requires. For example, Federated Search responses must be Atom, even though the OpenSearch standard admits several response formats. Atom is (mostly) intelligible to browser-based newsreaders (the lowest common denominator) while providing a relatively rich structure for parsing and accommodating domain-specific extensions. Extensions add elements that the standard does not define, following the standard's own extension mechanisms to the maximum extent possible.

Namespace

The namespace for Federated Search extensions is: http://esipfed.org/ns/fedsearch/1.0/. Note that this URI is currently used for namespace definition only and does not return a valid document.

Time in Atom Response

Time is specified only for the query, not the response, in the draft Time extension to OpenSearch. The ESIP Federated Search convention is to represent the time of datasets or granules as follows:

  • The namespace is defined as xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/"
  • Time is represented as XML elements "start" and "stop" (following the draft for the Query), e.g.:
    • <time:start>YYYY-MM-DDTHH:MM:SSZ</time:start>
    • <time:stop>YYYY-MM-DDTHH:MM:SSZ</time:stop>
  • By convention, time is in Universal (Zulu) time, using the format YYYY-MM-DDTHH:MM:SS[.SSS]Z. Fractional seconds are optional.
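The time convention above can be sketched in Python as follows. This is only an illustration, not part of the convention: the helper names to_zulu and add_time_extent are hypothetical, and the sketch assumes the caller supplies timezone-aware datetime values.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace URI taken from the convention text above.
TIME_NS = "http://a9.com/-/opensearch/extensions/time/1.0/"


def to_zulu(dt, fractional=False):
    """Format an aware datetime as YYYY-MM-DDTHH:MM:SS[.SSS]Z (hypothetical helper)."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = dt.astimezone(timezone.utc)
    text = utc.strftime("%Y-%m-%dT%H:%M:%S")
    if fractional:
        # Optional fractional seconds, truncated to milliseconds.
        text += ".%03d" % (utc.microsecond // 1000)
    return text + "Z"


def add_time_extent(entry, start, stop):
    """Append time:start and time:stop children to an Atom entry element."""
    for name, value in (("start", start), ("stop", stop)):
        child = ET.SubElement(entry, "{%s}%s" % (TIME_NS, name))
        child.text = to_zulu(value)
    return entry


if __name__ == "__main__":
    entry = ET.Element("{http://www.w3.org/2005/Atom}entry")
    begin = datetime(2012, 9, 28, 0, 0, 0, tzinfo=timezone.utc)
    end = datetime(2012, 9, 28, 23, 59, 59, tzinfo=timezone.utc)
    add_time_extent(entry, begin, end)
    print(ET.tostring(entry, encoding="unicode"))
```

Converting to UTC before formatting keeps the trailing "Z" truthful even when the input carries another offset.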

Dataset-Level Queries and Responses

A dataset-level query may use both the Geo and Time extensions to OpenSearch, if the server supports them. The response to a dataset-level query is an Atom document. It may include any number of links, but two are required:

  1. the default link, which points to TBS
  2. the link to the OpenSearch Description Document for a granule-level query against the dataset. Following the OpenSearch convention for AutoDiscovery in RSS/Atom, the link must be identified by the rel and type attributes as:
rel="search" type="application/opensearchdescription+xml"
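A client might locate the required granule-level search link along the following lines. This is a minimal sketch: the sample entry, its example.org URLs, and the function name granule_search_links are invented for illustration, and only the rel and type values come from the convention.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
OSDD_TYPE = "application/opensearchdescription+xml"


def granule_search_links(entry):
    """Return the href of every link marked rel="search" with the OSDD media type."""
    hrefs = []
    for link in entry.findall("{%s}link" % ATOM_NS):
        if link.get("rel") == "search" and link.get("type") == OSDD_TYPE:
            hrefs.append(link.get("href"))
    return hrefs


# Hypothetical dataset-level entry; the URLs are placeholders.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical dataset</title>
  <link href="http://example.org/datasets/abc"/>
  <link rel="search" type="application/opensearchdescription+xml"
        href="http://example.org/datasets/abc/osdd.xml"/>
</entry>"""

if __name__ == "__main__":
    print(granule_search_links(ET.fromstring(SAMPLE)))
```

Matching on both attributes, rather than on rel alone, avoids picking up search links that point to something other than an OpenSearch Description Document.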

Granule-level Queries and Responses

Granule-level Links

Atom allows inclusion of multiple links for a given entry, a key advantage for the rich variety of manifestations for a given data granule, such as:

  • data
  • browse image
  • metadata file
  • OPeNDAP URL
  • On-the-fly format conversion

However, it is important for a client to be able to distinguish the different kinds of links, which can be done using the "rel" attribute. As the standard set of "rel" values (self, related, alternate, enclosure, via) is insufficiently rich, we extend the standard using URIs within the ESIP namespace. The convention for this namespace is still under discussion, with the current proposal being:

  • http://esipfed.org/ns/fedsearch/1.0/data#
  • http://esipfed.org/ns/fedsearch/1.0/browse#
  • http://esipfed.org/ns/fedsearch/1.0/metadata#
  • http://esipfed.org/ns/fedsearch/1.0/opendap#

etc. This will eventually be linked to the servicecasting namespace, so that services advertised through that mechanism can be referenced as types as well.
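As a rough illustration of how a client could use such "rel" values, the sketch below groups the links of a granule entry by kind. It assumes the proposed (still unsettled) pattern of the Federated Search namespace followed by a kind name and "#", for example http://esipfed.org/ns/fedsearch/1.0/data#. The sample entry, its example.org URLs, and the function name links_by_kind are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
FEDSEARCH_NS = "http://esipfed.org/ns/fedsearch/1.0/"


def links_by_kind(entry):
    """Map each link kind ("data", "browse", ...) to a list of href values."""
    kinds = {}
    for link in entry.findall("{%s}link" % ATOM_NS):
        # Atom treats a link without rel as rel="alternate".
        rel = link.get("rel", "alternate")
        if rel.startswith(FEDSEARCH_NS):
            # Keep only the kind name: drop the namespace prefix and trailing "#".
            rel = rel[len(FEDSEARCH_NS):].rstrip("#")
        kinds.setdefault(rel, []).append(link.get("href"))
    return kinds


# Hypothetical granule-level entry; the URLs are placeholders.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical granule</title>
  <link href="http://example.org/granules/g1.html"/>
  <link rel="http://esipfed.org/ns/fedsearch/1.0/data#"
        href="http://example.org/granules/g1.hdf"/>
  <link rel="http://esipfed.org/ns/fedsearch/1.0/opendap#"
        href="http://example.org/opendap/g1.hdf"/>
</entry>"""

if __name__ == "__main__":
    for kind, hrefs in links_by_kind(ET.fromstring(SAMPLE)).items():
        print(kind, hrefs)
```

Standard values such as "enclosure" pass through unchanged, so a client can handle extended and standard link kinds with the same lookup.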

Implementation Guidelines

Appendices

Sample Server Implementation

Sample Client Implementation