How-To Guide for Implementing ESIP Federated Search Servers

From Earth Science Information Partners (ESIP)

Introduction

ESIP Federated Search is a simple framework for running federated (distributed) queries for Earth science data across participating members. It is based on the OpenSearch convention for distributed searches, which centers on OpenSearch Description Documents. Each of these XML documents includes a URL template that shows how to construct a query URL for a particular search engine.
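
As a rough illustration of how a client might use such a template, the sketch below reads a description document and fills in the template. The `example.org` endpoint, the sample document, and the helper names `get_template` and `fill_template` are hypothetical; only the `{searchTerms}` placeholder and the trailing `?` marking an optional parameter come from the OpenSearch 1.1 template syntax.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace of OpenSearch 1.1 description documents.
OSDD_NS = "{http://a9.com/-/spec/opensearch/1.1/}"

# Hypothetical description document; the endpoint is made up for illustration.
OSDD = """<?xml version="1.0"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example datasets</ShortName>
  <Description>Hypothetical dataset search</Description>
  <Url type="application/atom+xml"
       template="http://example.org/search?q={searchTerms}&amp;page={startPage?}"/>
</OpenSearchDescription>"""


def get_template(osdd_xml, mime_type="application/atom+xml"):
    """Return the URL template of the first <Url> element with the given type."""
    root = ET.fromstring(osdd_xml)
    for url in root.iter(OSDD_NS + "Url"):
        if url.get("type") == mime_type:
            return url.get("template")
    raise ValueError("no Url element of type " + mime_type)


def fill_template(template, values):
    """Replace {name} and {name?} placeholders with URL-encoded values."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError("missing required parameter: " + name)

    return re.sub(r"\{([^{}?]+)(\?)?\}", substitute, template)


query_url = fill_template(get_template(OSDD), {"searchTerms": "carbon monoxide"})
```

Here `query_url` is the address a client would request to run the keyword search; the optional `startPage` parameter is left empty because no value was supplied.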

The ESIP Federated Search includes certain conventions to support a two-step dataset/granule (file) query. In the first step, a keyword query (and sometimes space-time criteria) is issued for datasets. In the second step, each selected dataset is queried for granules matching space-time (and possibly keyword) criteria.

[Figure: Esip 2step.png]

The key element linking the two steps is an OpenSearch Description Document for each dataset, which describes how to run the granule (file) search for that dataset.
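
The control flow of the two steps can be sketched as follows. This is only an outline under stated assumptions: the `Dataset` class, `two_step_search`, and the two `fake_*` stand-ins are hypothetical names, and the stand-ins return canned values instead of contacting a real search engine.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    title: str
    osdd_url: str  # URL of the dataset's own OpenSearch Description Document


def two_step_search(keyword, bbox, start, end,
                    search_datasets, search_granules,
                    select=lambda dataset: True):
    """Step 1: find datasets by keyword. Step 2: query each selected dataset."""
    results = {}
    for dataset in search_datasets(keyword):            # step 1: dataset query
        if select(dataset):                             # user or client picks datasets
            results[dataset.title] = search_granules(   # step 2: granule query
                dataset.osdd_url, bbox, start, end)
    return results


# Stand-ins for real servers (assumptions, for illustration only).
def fake_search_datasets(keyword):
    catalog = [
        Dataset("Carbon monoxide L2", "http://example.org/osdd/co.xml"),
        Dataset("Ozone L2", "http://example.org/osdd/o3.xml"),
    ]
    return [d for d in catalog if keyword.lower() in d.title.lower()]


def fake_search_granules(osdd_url, bbox, start, end):
    # A real client would fetch osdd_url, fill in its template, and parse the feed.
    return [f"{osdd_url}#granule-{start}"]


hits = two_step_search("monoxide", (-180, -90, 180, 90),
                       "2009-10-10", "2009-10-11",
                       fake_search_datasets, fake_search_granules)
```

Passing the two search functions in as arguments keeps the sketch independent of any particular server; the per-dataset description document URL is the only thing carried from step one to step two.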

What Do I Need for an ESIP Federated Search Server?

In a nutshell, you need four things:

  1. A dataset search engine that supports at least keyword (free-text) search, and optionally space-time constraints
  2. An OpenSearch Description Document describing the dataset search engine
  3. A granule-level search engine supporting space-time query
  4. An OpenSearch Description Document for each dataset that describes the granule-level search template for that dataset

Let's look at each of these in detail.

Dataset Search Engine

The Dataset Search Engine should return an Atom document containing the dataset results, for example:

<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" 
            xmlns:geo="http://a9.com/-/opensearch/extensions/geo/1.0/" xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/">
<author><name>GES DISC</name><email>mirador-disc@listserv.gsfc.nasa.gov</email></author>
<opensearch:itemsPerPage/><updated>2009-11-18T18:30:02Z</updated>
<title>Mirador collection results for Monoxide</title>
<id>http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl</id>
<subtitle type="html">Monoxide (distributed by GES DISC)
        </subtitle>
<link rel="self" href="http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl"/>
<link rel="http://esipfed.org/ns/fedsearch/1.0/search#" href="http://mirador.gsfc.nasa.gov/cgi-bin/mirador/granlist.pl"/>
<entry>
<id>http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_ML2CO.002.xml</id>
<updated>2009-11-18T18:30:02Z</updated>
<author><name>GES DISC</name><email>mirador-disc@listserv.gsfc.nasa.gov</email></author>
<title>MLS/Aura L2 Carbon Monoxide (CO) Mixing Ratio (ML2CO) </title>
<link href="http://mirador.gsfc.nasa.gov/cgi-bin/mirador/granlist.pl?page=1&amp;dataSet=ML2CO&amp;version=002&amp;allversion=002&amp;keyword=Monoxide&amp;pointLocation=(-90,-180),(90,180)&amp;location=(-90,-180),(90,180)&amp;searchType=Location&amp;event=&amp;startTime=2009-10-10&amp;endTime=2009-10-11 23:59:59&amp;search=&amp;CGISESSID=f408c488319554acd03731525a55f5a8&amp;nr=4&amp;temporalres=1%20Day(s)&amp;prodpg=http://mirador.gsfc.nasa.gov/collections/ML2CO__002.shtml&amp;longname=MLS/Aura L2 Carbon Monoxide (CO) Mixing Ratio&amp;granulePresentation=ungrouped" rel="http://esipfed.org/ns/fedsearch/1.0/data#"/>
<time:start>2004-08-08</time:start>
<time:end>2009-12-14</time:end>
<summary type="html">Dataset:ML2CO.002(1)</summary>
<link rel="search" type="application/opensearchdescription+xml" title="ML2CO.002" href="http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_ML2CO.002.xml"/>
<link rel="enclosure" type="text/html" href="http://mirador.gsfc.nasa.gov/collections/ML2CO__002.shtml" title="/OpenSearch/mirador_opensearch_ML2CO.002.xml info"/>
</entry>
</feed>

Note especially the <link> element with rel "search" and type "application/opensearchdescription+xml". Its href is the URL of the OpenSearch Description Document that describes how to construct a URL to search for ML2CO.002 data.
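
A client could pick out these links along the following lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `granule_osdd_links` is hypothetical, and the embedded feed is a trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
OSDD_TYPE = "application/opensearchdescription+xml"

# Trimmed copy of the sample dataset feed shown above.
FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Mirador collection results for Monoxide</title>
  <entry>
    <title>MLS/Aura L2 Carbon Monoxide (CO) Mixing Ratio (ML2CO) </title>
    <link rel="search" type="application/opensearchdescription+xml" title="ML2CO.002"
          href="http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_ML2CO.002.xml"/>
    <link rel="enclosure" type="text/html"
          href="http://mirador.gsfc.nasa.gov/collections/ML2CO__002.shtml"/>
  </entry>
</feed>"""


def granule_osdd_links(feed_xml):
    """Return (dataset title, description document URL) pairs, one per matching link."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.iter(ATOM + "entry"):
        title = (entry.findtext(ATOM + "title") or "").strip()
        for link in entry.findall(ATOM + "link"):
            # Keep only the link that points at the granule search description.
            if link.get("rel") == "search" and link.get("type") == OSDD_TYPE:
                found.append((title, link.get("href")))
    return found


links = granule_osdd_links(FEED)
```

Each returned URL can then be fetched to obtain the granule-level search template for that dataset.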