==== Python Scripters ====

'''Mentors:''' Nga Chung and ...
 
Example of a Python script for a dataset-level search:

<pre>
from xml.etree.ElementTree import parse
import sys
import urllib

# Build the OpenSearch URL, using the command-line argument as the keyword value
url = "http://podaac.jpl.nasa.gov:8890/ws/search/dataset?"
url += urllib.urlencode({'keyword': sys.argv[1]})

namespace = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/",
             "atom": "http://www.w3.org/2005/Atom"}

# Parse the Atom response and print each entry's title, description,
# and granule search link
xml = parse(urllib.urlopen(url))
items = xml.findall('{%(atom)s}entry' % namespace)
for elem in items:
    title = elem.find("{%(atom)s}title" % namespace).text.strip()
    print 'Title: ' + title

    description = elem.find("{%(atom)s}content" % namespace).text.strip()
    print 'Description: ' + description

    link = elem.find("{%(atom)s}link[@rel='search']" % namespace).attrib['href']
    print 'Link: ' + link + '\n'
</pre>
 
 
To search for sea surface temperature datasets and print each title, description, and granule search link, execute the script with the command-line argument "sea surface temperature".
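
For example, if the script above is saved as dataset_search.py (an assumed filename; any name will do):

 python dataset_search.py "sea surface temperature"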
 
 
Example of a Python script for a granule-level search:

<pre>
from xml.etree.ElementTree import parse
from datetime import datetime, timedelta
import sys
import urllib

# Search window: all of yesterday
yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
start = yesterday + "T00:00:00"
stop = yesterday + "T23:59:59"

# Build the OpenSearch URL with the parameters shortName, startTime, and endTime
url = "http://podaac.jpl.nasa.gov:8890/ws/search/granule?"
url += urllib.urlencode({'shortName': sys.argv[1], 'startTime': start, 'endTime': stop})

namespace = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/",
             "atom": "http://www.w3.org/2005/Atom"}

# Grab the next page of results while a next page exists
while url is not None:
    xml = parse(urllib.urlopen(url))
    items = xml.findall('{%(atom)s}entry' % namespace)
    for elem in items:
        # Download each granule via its FTP link, named after the last path segment
        link = elem.find("{%(atom)s}link[@title='FTP URL']" % namespace).attrib['href']
        newfile = link.split("/")[-1]
        print 'Downloading: ' + newfile
        urllib.urlretrieve(link, newfile)

    next_page = xml.find("{%(atom)s}link[@rel='next']" % namespace)
    if next_page is None:
        url = None
    else:
        url = next_page.attrib['href']
</pre>
 
 
To download all JPL-L2P-MODIS_A granules from yesterday, execute the script with the command-line argument "JPL-L2P-MODIS_A".
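
For example, if the script above is saved as granule_download.py (an assumed filename):

 python granule_download.py JPL-L2P-MODIS_A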
 
  
 

Overview

Overview Slides

What's the Plan? Get together to make some simple Discovery clients! All are welcome; no previous experience or coding skills are necessary!

  • Two back-to-back sessions:
  1. Tuesday, July 17, 2012. 1:30pm-3:00pm
  2. Tuesday, July 17, 2012. 3:30pm-5:00pm

Abstract: The set of ESIP Discovery services encompasses the overlapping conventions of the Earth science federated OpenSearch, Collection Casting, Granule Casting, and Service Casting feed standards. To help lower the barrier to entry, we will provide a set of simple, hands-on approaches to using Discovery services, including walk-throughs of some "low-hanging fruit" approaches to calling OpenSearch, Collection Casting, Granule Casting, and Service Casting.

Tuesday, July 17, 2012. 1:30pm-3:00pm: Non-Coders

Discovery Hack-a-thon Overview (20-mins)

A quick overview of ESIP Discovery services, setting the stage for calling them from various simple clients.

Poll the audience for interest.

Hack-a-thon Breakout (70-mins)

Geoportal

We will have a Geoportal from our testbed up and running. You'll learn what protocols the geoportal already supports for service validation, how to use the geoportal to validate and register data and services, and how to add configurations for validation to a geoportal instance. Or, just take the Discovery Cluster geoportal instance out for a spin!

Mentor: Christine White

Browsers and News Readers

It's even possible to interact with both data "casts" and OpenSearch servers using a simple browser. News readers work as well, and they are especially useful for data casts!

Mentors: Ruth Duerr and ...
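
For example, a collection-level OpenSearch URL (borrowed from the Mirador examples later on this page) can be pasted directly into a browser or feed reader to get an Atom feed of matching collections:

 http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl?keyword=ozone;format=atom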

Cast Publishing and Aggregation

Learn how to use existing web apps to create one-off data casts. Come create a cast or two and then see it found (that's the idea, anyway).

Mentors: Ruth Duerr and ...

Command line

Yes, you can interact with an OpenSearch server or a data cast using basic command-line URL getters, like wget (available for all platforms) and curl; see the sketch below.

Mentors: Hook Hua and Chris Lynnes
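
A minimal sketch, using the PO.DAAC OpenSearch endpoint that appears in the Python examples on this page (any OpenSearch endpoint works the same way):

 # Dataset-level search with wget, saving the Atom response to a file
 wget -O results.xml "http://podaac.jpl.nasa.gov:8890/ws/search/dataset?keyword=sea+surface+temperature"
 
 # The same search with curl, printing the response to standard output
 curl "http://podaac.jpl.nasa.gov:8890/ws/search/dataset?keyword=sea+surface+temperature"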


Tuesday, July 17, 2012. 3:30pm-5:00pm: Coders

Discovery Hack-a-thon Overview (15-mins)

A quick overview of ESIP Discovery services, setting the stage for calling them from various simple clients.

Poll the audience for interest.

Hack-a-thon Breakout (75-mins)

Perl Monks

Learn how to hack together a quick client. This is particularly useful for scripted search and acquisition of datasets according to your own, possibly idiosyncratic, needs.

Sample code: Media:esip_fedsearch2.pl.txt

Mentors: Chris Lynnes and Brian Duggan

[Mojolicious] is a nice tool for interacting with web services. Install it like so:

 curl http://get.mojolicio.us | sh

If you are not root, also do this:

  cpanm --local-lib=~/perl5 local::lib && eval $(perl -I ~/perl5/lib/perl5/ -Mlocal::lib)

Then you will have the command-line tool "mojo", which can send requests and extract content using CSS selectors. For more documentation, try:

 mojo help get
 perldoc Mojo::DOM
 perldoc Mojo::UserAgent

Here are some sample queries:

 # dataset level
 mojo get http://mirador.gsfc.nasa.gov/mirador_dataset_opensearch.xml 'OpenSearchDescription > Url' attr template 
 http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl?keyword={searchTerms}&page=1&count={count}&osLocation={geo:box}&startTime={time:start}&endTime={time:end}&format=rss
 http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl?keyword={searchTerms}&page=1&count={count}&osLocation={geo:box}&startTime={time:start}&endTime={time:end}&format=atom
 # collection level search  
 mojo get 'http://mirador.gsfc.nasa.gov/cgi-bin/mirador/collectionlist.pl?keyword=ozone;format=atom' 'entry > link[type="application/opensearchdescription+xml"]' attr href
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMTO3.003.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_SSBUVO3.008.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_SBUV2N09O3.008.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_SBUV2N11O3.008.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_SBUV2N16O3.008.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMTO3G.003.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMDOAO3Z.003.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_SBUVN7O3.008.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMDOAO3G.003.xml
 http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMDOAO3.003.xml
 # Granule level search
 mojo get http://mirador.gsfc.nasa.gov/OpenSearch/mirador_opensearch_OMTO3.003.xml 'Url[type="application/atom+xml"]' attr template
 http://mirador.gsfc.nasa.gov/cgi-bin/mirador/granlist.pl?dataSet=OMTO3.003&page=1&maxgranules={os:count}&osLocation={geo:box}&order=a&endTime={time:end}&startTime={time:start}&format=atom
 # Get the url for a specific granule
 mojo get 'http://mirador.gsfc.nasa.gov/cgi-bin/mirador/granlist.pl?format=rss&startTime=2010-01-01&endTime=2010-01-02&order=a&osLocation=&maxgranules=1&page=1&dataSet=OMTO3.003' 'item > link' text
 ftp://aurapar2u.ecs.nasa.gov/data/s4pa///Aura_OMI_Level2/OMTO3.003//2009/365/OMI-Aura_L2-OMTO3_2009m1231t2226-o29062_v003-2012m0331t213901.he5
 http://aurapar2u.ecs.nasa.gov/opendap/Aura_OMI_Level2/OMTO3.003//2009/365/OMI-Aura_L2-OMTO3_2009m1231t2226-o29062_v003-2012m0331t213901.he5
 # Get the granule itself
 mojo get http://aurapar2u.ecs.nasa.gov/opendap/Aura_OMI_Level2/OMTO3.003//2009/365/OMI-Aura_L2-OMTO3_2009m1231t2226-o29062_v003-2012m0331t213901.he5 > data.he5

Python Scripters

Learn how to use Python, an easy-to-use but powerful scripting language, to interact with OpenSearch servers and find the data you need. The dataset-level and granule-level example scripts at the top of this page are a starting point.

Mentors: Nga Chung and ...

Java Programmers

Mentors: Eric Rozell and ...

XSL Transforms

Learn how to generate simple interfaces to OpenSearch cast feeds. ESIP Discovery casts extend the Atom feed format, so these transforms will expose the additional information to the user interface.

The XSLT processors we'll try out are

  1. your modern browser
  2. Saxon-HE 9.4 (open source home edition)


Getting Saxon


A complete working example tutorial of an XSL transform to process Atom 1.0 into XHTML


Edit the XSL to add the namespaces relevant for OpenSearch. Add the namespaces to the top-level <xsl:stylesheet> element:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:a="http://www.w3.org/2005/Atom"
  xmlns:xhtml="http://www.w3.org/1999/xhtml" 
  xmlns="http://www.w3.org/1999/xhtml" 
  xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/"
  xmlns:geo="http://a9.com/-/opensearch/extensions/geo/1.0/"
  exclude-result-prefixes="a xhtml time">

...


Download a sample ESIP OpenSearch granule response.

Edit the OpenSearch response to include the style-sheet directive.

<?xml version='1.0' encoding='UTF-8'?>

<?xml-stylesheet type="text/xsl" href="opensearch2xhtml.xsl"?>

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:os="http://a9.com/-/spec/opensearch/1.1/" xmlns:time="http://a9.com/-/opensearch/extensions/time/1.0/" xmlns:geo="http://a9.com/-/opensearch/extensions/geo/1.0/" xmlns:nsidc="http://nsidc.org/ns/opensearch/1.1/">

...

Open the local OpenSearch feed XML with your browser, or transform it with the Saxon command line.


Saxon command-line interface

# -t  Display version and timing information on the standard error output. The output also traces the files that are read and written, and the extension modules that are loaded.
# -tree:(linked|tiny|tinyc)  Selects the implementation of the internal tree model: -tree:tiny selects the "tiny tree" model (the default), -tree:linked selects the linked tree model, and -tree:tinyc selects the "condensed tiny tree" model.

java -cp saxon9he.jar  net.sf.saxon.Transform  -tree:linked -t -s:{source-xml-filename}  -xsl:{xsl-filename}  -o:{output-filename}
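
For example, transforming the sample granule response with the stylesheet from this walkthrough (the input and output filenames are assumptions; substitute your own):

 java -cp saxon9he.jar net.sf.saxon.Transform -tree:linked -t -s:granule_response.xml -xsl:opensearch2xhtml.xsl -o:granule_response.html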


Mentors: Hook Hua and ...

Hack-a-thon Resources

Casts

Collection Casts

Granule-level Casts

Service Casts

OpenSearch Description Documents

Top Level (search for datasets)

Granule-level Examples

Sample Code

Example URLs

OpenSearch

How-To Guides


Back to Discovery_Hack-a-thon