Difference between revisions of "Summer 2010 Technical Workshops"

From Earth Science Information Partners (ESIP)
 
(21 intermediate revisions by 7 users not shown)
Line 15: Line 15:
  
  
=Suggested Workshops=
+
=Workshops=
  
 
==Data Visualization Boot Camp==
 
Line 26: Line 26:
 
==Semantic Web Workshop==
 
  
     * Semantic Web - Three part presentation
+
     * Please note - this is moved to [http://wiki.esipfed.org/index.php/Semantic_Web_2010_Technical_Workshop]
    * Peter Fox, Rahul Ramachandran, Hook Hua
 
  
 
+
* Add special interest topics here
Lecture 1: Practical aspects of creating semantic web applications  by Peter Fox
+
* Semantic Web external web page for NASA TIWG [http://tiwg.wik.is/Semantic_Web/Semantic_Web_External_Page]
 
 
This part of the workshop is targeted at an audience that spans the mid-advanced beginner to intermediate level,
 
interested in what an initial end-to-end prototype implementation of a semantic web application
 
would consist of. The presentation will start with a use case, decompose it along with needed vocabularies,
 
and relationships, perform the initial modeling, reuse and engineering steps for the ontology, using
 
tools such as Cmap and Protege and proceed to instance generation, ontology validation and
 
verification, prototyping triple store, query and programming language and inference choices
 
that may need to be made. The participant should come away with the general method and
 
suitable initial prototyping choices that can be made when starting semantic web applications so
 
that long development cycles are minimized.
 
 
 
Lecture 2: Building a Linked Data Cloud for Earth Science by Rahul Ramachandran
 
 
 
"Semantic Web isn’t just about putting data on the web. It is about making links so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related data" - Tim Berners-Lee.
 
 
 
This presentation will look at what linked data is and how to publish and consume data in this cloud. Some existing examples and research applications will be presented. Finally, this presentation will focus on Earth Science specific issues related to Linked Data, such as existing technology components and hurdles.
 
 
 
Lecture 3: Semantic Web Technologies for Addressing Knowledge Gaps in Using Science Applications by Hook Hua
 
 
 
Many science data processing suites have grown in complexity where no
 
single person may fully understand all aspects of the system. Developed over
 
the course of years (or decades) and by many groups of people, this

processing software often encompasses a plethora of processing steps and data
 
products. We will present an overview of various semantically-driven
 
technologies and techniques that can address this knowledge gap issue in the
 
scope of InSAR data processing. We will cover ontology development and how
 
it can be applied with inferencing, rules, query, and natural language
 
processing to help the non-expert users.
 
  
 
==Collaborative Technologies and Geoportal Extension==
 
Line 71: Line 42:
 
Intended Audience:  All, particularly organizations that produce or manage geospatial datasets.  
 
  
==Federated Search==
+
==ESIP Federated Search==
  
    * ESIP's Federated Search
+
* Workshop lead: Chris Lynnes
    * Workshop lead: Chris Lynnes
+
* [[ESIP_FedSearchWorkshop_2010|Agenda]]
    * [[ESIP_FedSearchWorkshop_2010|Agenda]]
 
  
 
Abstract:
 
Line 87: Line 57:
 
Intended Audience:  Both Federated Search newbies (esp. for the first half) and Federated Search cluster members (second half).  Newbies are welcome to watch some of the sausage-making in the second half.
 
  
Workshop Category:  Presentation Track.  
+
Workshop Category:  Presentation Track.
  
 
==Interoperability 101==
 
 +
* Focus: Survey course for new members to introduce different interoperability standards
 +
* Workshop lead: Karl Benedict
 +
 +
Abstract:
 +
 +
This workshop will provide an overview of key interoperability standards and protocols that are important in Earth science data and information exchange and data processing. The standards discussed in the workshop will include the following:
  
    * Focus: Survey course for new members to introduce different interoperability standards
+
*Open Geospatial Consortium: Web Map, Web Feature, and Web Coverage Services (WMS, WFS, and WCS respectively), Catalog Services for Web (CSW), and Sensor Web Enablement (SWE) suite of standards [[http://wiki.esipfed.org/images/a/a8/Interop101_ogc.pdf presentation PDF file]](Karl Benedict)
    * Workshop lead: Karl Bennedict
 
  
 +
* [[Making Science Data Easier to Use with OPeNDAP]] (Chris Lynnes)
 +
 +
*Pomegranate as an example of webification (w10n) that enables easier access of remote science data in RESTful way [http://pomegranate.jpl.nasa.gov/talk/w10n-sci.p9e.esip2010.ppt ppt] (Xing)
  
 
==Climate Modeling==
 
  
 
     * Focus: The use of climate models
 
     * Workshop lead: Rob Raskin to recruit from ORNL
+
     * Workshop lead: Luca Cinquini, Auroop Ganguly, and Raju Vatsavai
 +
 
 +
Part 1: The Earth System Grid Federation: Building a Global Distributed Infrastructure
 +
for Climate Change Research
 +
 
 +
Presenter: Luca Cinquini
 +
 
 +
Abstract:
 +
 
 +
The Earth System Grid (ESG) project is building a global infrastructure to provide the scientific community with unprecedented access to massive, distributed, heterogeneous datasets for climate change research. We will describe the ESG software architecture and demonstrate some of its federation functionality such as single sign-on across gateways, metadata exchange, and distributed data access. We will present a status update on the most current ESG activities, such as: the coordination and preparation for the CMIP5/IPCC-AR5 archive; the expansion of the ESG federation to other federal agencies such as NASA and NOAA and international partners such as BADC and DKRZ; and the inclusion of observational datasets alongside the CMIP5 model output. Finally, we will outline some future lines of development within ESG, such as the movement towards an open source model, a shift in the architectural paradigm, and the integration with middleware for scientific analysis such as the Climate Data Exchange (CDX) services and toolkit.
 +
 
 +
 
 +
Part 2: A "new kind" of knowledge discovery: Case study on climate change
 +
 
 +
Presenter: Auroop Ganguly
 +
 
 +
Abstract:
 +
 
 +
A new kind of knowledge discovery is presented which integrates interdisciplinary computational data sciences, process models and decision sciences to provide actionable predictive insights on urgent societal priorities in multidisciplinary settings. The focus is on extremes, nonlinear processes, rare events and change, with a translation to uncertainty and risk assessments. The need for innovative mathematical and computational solutions is motivated along with the ability to discover insights from multisource and heterogeneous data. While there is a need to work with massive data volumes, the data generation processes can be noisy and nonlinear, and the data relevant for analysis may be short compared to the length required for analysis. A case study is presented for knowledge discovery in climate, with a particular emphasis on how data-guided insights can complement physics-based computational models to generate credible predictive insights about climate extremes and regional change along with an assessment of the reliability of the climate projections.
 +
 
 +
 
 +
Part 3: Large Scale Remote Sensing Data Mining
 +
 
 +
Presenter: Raju Vatsavai
 +
 
 +
Abstract:
 +
 
 +
Biomass monitoring over large geographic regions using remote sensing poses several challenges. Existing change detection techniques are not adequate or scalable for continuous monitoring. On the other hand, characterizing changes requires accurate classification of remote sensing images. Supervised classification over large geographic regions poses the following challenges: i) inadequate ground truth data, ii) aggregate challenges, iii) spatial homogeneity, and iv) spatial heterogeneity. In this presentation, we discuss recent advances in spatiotemporal data mining, especially the techniques that exploit the subtle multidimensional signals through the joint use of high temporal resolution (MODIS) data and moderate- and fine-spatial resolution (AWiFS) satellite images for extracting multi-temporal biomass change information, including crop types and their conditions. Specifically, we discuss Gaussian Process (GP) based classification and change detection techniques, semi-supervised and sub-class classification techniques and their spatial extensions. In addition, we will discuss computational challenges in scaling these algorithms for large geographic regions and provide recent results on multi-core and many-core architectures.
  
 
==Drupal in a Day==
 
 
      
 
      
 
     * Workshop lead: Sunil Movva, Jerry Pan and Giri Palanisamy
 
    * Part 1: Learn the basics
+
 
    * Part 2: Earth Science customizations
+
'''Abstract'''
 +
 
 +
The "Drupal in a Day – Part 1" workshop will cover the basics of the Drupal content management system. It will cover the terminology and fundamental concepts of Drupal. You will learn how to set up and manage a Drupal-based website, and you will do it on your laptop. Topics include:
 +
 
 +
1) What Drupal is and why you should care
 +
 
 +
2) Setting up a Drupal website
 +
 
 +
3) Building a Drupal website using the Drupal administration user interface
 +
 
 +
4) Administering a Drupal site
 +
 
 +
5) Drupal Theme Concepts
 +
 
 +
The “Drupal in a Day – Part 2” workshop will walk you through extending Drupal’s core functionality with contributed modules. You will learn how to create custom content types with CCK and later will be introduced to the Drupal architecture and APIs essential for custom module development. Topics include:
 +
 
 +
1) Contributed modules
 +
 
 +
2) Creating custom content types
 +
 
 +
3) A brief introduction to module development
 +
 
 +
 
 +
'''Intended Audience:'''
 +
This workshop is geared toward all audiences, including data managers, scientists, and developers. No prior Drupal knowledge is required. Programming for Drupal is covered as an advanced topic; the rest of the workshop does not involve programming.
 +
 
 +
'''Workshop Category:'''
 +
Working meeting: overview and tutorials with hands on participation.
  
 
==Service & Data Casting==
 
Line 116: Line 148:
 
and data granules using service, dataset, and granule
 
 
casts (RSS or Atom feeds).  Brian will present on service
 
casting, and Ruth Duerr will demo some dataset casting.
+
casting, Ruth Duerr will demo some dataset casting, and
 +
Andy Bingham will discuss JPL's Datacasting efforts.
 
Multiple groups are doing this kind of casting, so we
 
 
will discuss intended uses, how to agree on these formats (use of georss and other metadata tags), and coordinate efforts to minimize redundant work.
 
Line 125: Line 158:
 
lightweight, searchable metadata for services and
 
 
datasets in these 'casting' formats.
 
 +
 +
== Cloud Computing ==
 +
Workshop Lead: Dr. Chaowei Phil Yang
 +
 +
Cloud computing is a new computing paradigm that treats computing as a utility, exemplified by Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Federal agencies are being asked to report how and why they are using the cloud in the coming years. This workshop will introduce the concept of cloud computing and use an Earth science application as an example to illustrate how cloud computing can be leveraged for Earth sciences.
 +
 +
==OODT==
 +
 +
    * Workshop lead:  Chris Mattmann
 +
    * URL:  http://oodt.jpl.nasa.gov
 +
 +
Abstract:
 +
 +
The Object Oriented Data Technology (OODT) middleware is an independent set of reusable software components that can be deployed, tailored and configured to build out scalable data management systems. Complete with data curation/ingestion, processing and dissemination functionality, along with the ability to be deployed onto heterogeneous hardware environments including grids and clouds, OODT has been heavily deployed over the last 10 years at a number of government agencies (NASA, NIH, DoD), universities, and other institutions across the world.
 +
 +
OODT is the flagship technology that powers the NASA Planetary Data System (PDS), the NCI Early Detection Research Network (EDRN) project, and data processing systems for current NASA Earth science missions including OCO/ACOS, NPP Sounder PEATE, and the Soil Moisture Active Passive (SMAP) mission. In addition, OODT is being used in Earth science research projects, including the ACCESS07 Virtual Oceanographic Data Center (VODC) project, the ACCESS09 funded Coastal Marine Discovery System (CDMS) project, and the 2009-funded JPL-led Climate Data eXchange (CDX) project, an effort to share NASA observational data with the Earth System Grid.
 +
 +
I’ll cover the core components of OODT, its architecture, implementation, and recent movement to the Apache Software Foundation (ASF). OODT is the first NASA project to be hosted at Apache. I’ll discuss our efforts to pave the way in making NASA a vibrant member of the open source community.
 +
 +
==Metadata Standards for GEOSS==
 +
 +
    * Workshop lead:  Phil Yang
 +
 +
Abstract:
 +
 +
Metadata is critical in archiving, searching, finding, locating, and utilizing geospatial resources for scientific research. We will introduce metadata standards and their value in building the GEOSS clearinghouse. The clearinghouse utilizes FGDC, ISO, and other community metadata standards and encodes them in various XML schemas for interoperability among registries and catalogs, including the GEOSS “Component and Service Registry”, Geospatial One Stop, INSPIRE, Geoconnect and other catalogs. The workshop will also introduce problems encountered in metadata use, such as uniqueness and the lack of important items needed for better metadata utilization. This research, development, and Technical Workshop is sponsored by the FGDC CAP program and the NASA spatial interoperability program.

Latest revision as of 06:40, July 21, 2010

Updated: 05/21/2010 Rahul Ramachandran


Workshop Abstract Proposal Template (Need Feedback)

   * Description:
   * Audience:
   * Workshop Category Type:
         o Working meetings – working meetings that are set up more like tutorials, with hands-on participation
         o Presentation track
   * Special requirements:
         o Room Configuration Requirements
         o Equipment requirements
   * Proposed Time Length (90 Minutes max):


Workshops

Data Visualization Boot Camp

   * Focus: Follow on from last summer’s meeting
   * Workshop lead: Bruce Caron

Abstract: Many ESIP partners process datasets into visualizations for scientific investigation or as information for end users. Using better practices in data visualization will lead to more effective science and communication. The ESIP Summer meetings are a good place for earth data visualizers to share their successful practices and to build a set of best practice models for earth data visualizations. This workshop will bring together top data visualizers from several ESIPs, and others looking to learn more about visualization tools and techniques. The goal is to start a visualization resource that can grow over time to become an educational tool for the Federation and the larger earth science data community.

Semantic Web Workshop

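   * Please note - this workshop has moved to http://wiki.esipfed.org/index.php/Semantic_Web_2010_Technical_Workshop
   * Add special interest topics here
   * Semantic Web external web page for NASA TIWG: http://tiwg.wik.is/Semantic_Web/Semantic_Web_External_Page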

Collaborative Technologies and Geoportal Extension

   * Exposing Data with Web Services
   * Workshop Lead: Christine Eggers ESRI

Abstract:

It is no secret that exposing geospatial datasets as web services makes them easier for users to discover and use. Web services allow users to access and then combine different datasets for their own purposes and in their own map viewers, often in creative ways. Even so, the standards, organizational policies, and technical aspects of how to move datasets to web services can be daunting. This workshop investigates the value of web services, how to expose datasets as standards-based web services, and tools that enable users to discover web-accessible data resources.

Intended Audience: All, particularly organizations that produce or manage geospatial datasets.

ESIP Federated Search

  • Workshop lead: Chris Lynnes
  • Agenda

Abstract:

ESIP's Federated Search cluster is developing a framework to support space-time and keyword search for data. The aim is to include both small and large data providers, both data and services (eventually), and both dataset and file-level searches. The approach is to leverage the OpenSearch standard, layering an ESIP-specific convention on top.

This workshop will include:

  • an introduction and demos of existing Federated Search implementations (yes, they exist!)
  • work on details of the ESIP convention and how to foster client and server development
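
As a rough illustration of the approach described in the abstract, the sketch below fills in an OpenSearch-style URL template with keyword, space, and time constraints. The placeholder names ({searchTerms}, {geo:box}, {time:start}, {time:end}) follow the OpenSearch core specification and its Geo and Time extensions; the host, path, query parameter names, and the helper fill_template are hypothetical and are not taken from the ESIP convention itself.

```python
import re
from urllib.parse import quote

# Hypothetical URL template. Only the {placeholder} names come from the
# OpenSearch core, Geo, and Time conventions; the host and path are made up.
TEMPLATE = (
    "http://example.org/opensearch?q={searchTerms}"
    "&bbox={geo:box?}&start={time:start?}&end={time:end?}"
)


def fill_template(template, values):
    """Substitute {name} and {name?} placeholders in a URL template.

    Required placeholders ({name}) must be present in `values`;
    optional ones ({name?}) are replaced by an empty string when missing.
    """
    def replace(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            # Keep commas and colons readable; percent-encode everything else.
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([^{}?]+)(\?)?\}", replace, template)


if __name__ == "__main__":
    url = fill_template(TEMPLATE, {
        "searchTerms": "sea surface temperature",
        "geo:box": "-180,-90,180,90",            # west,south,east,north
        "time:start": "2010-07-01T00:00:00Z",
    })
    print(url)
```

A client would normally read the template from the provider's OpenSearch description document rather than hard-coding it as done here.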

Intended Audience: Both Federated Search newbies (esp. for the first half) and Federated Search cluster members (second half). Newbies are welcome to watch some of the sausage-making in the second half.

Workshop Category: Presentation Track.

Interoperability 101

  • Focus: Survey course for new members to introduce different interoperability standards
  • Workshop lead: Karl Benedict

Abstract:

This workshop will provide an overview of key interoperability standards and protocols that are important in Earth science data and information exchange and data processing. The standards discussed in the workshop will include the following:

  • Open Geospatial Consortium: Web Map, Web Feature, and Web Coverage Services (WMS, WFS, and WCS respectively), Catalog Services for Web (CSW), and Sensor Web Enablement (SWE) suite of standards, presentation PDF file: http://wiki.esipfed.org/images/a/a8/Interop101_ogc.pdf (Karl Benedict)
  • Making Science Data Easier to Use with OPeNDAP (Chris Lynnes)
  • Pomegranate as an example of webification (w10n) that enables easier access of remote science data in a RESTful way, ppt: http://pomegranate.jpl.nasa.gov/talk/w10n-sci.p9e.esip2010.ppt (Xing)
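
To give a flavor of what one of the OGC services listed above looks like from a client's side, here is a minimal sketch that builds a WMS 1.1.1 GetMap request URL. The query parameter names are the standard WMS ones; the server address, layer name, and the helper wms_getmap_url are hypothetical and are not taken from the workshop materials.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                 # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical server and layer name, for illustration only.
    print(wms_getmap_url("http://example.org/wms", ["temperature"],
                         (-180, -90, 180, 90), 800, 400))
```

Because the request is an ordinary URL, any client that can fetch an image over HTTP can consume the map, which is a large part of what makes these services interoperable.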

Climate Modeling

   * Focus: The use of climate models
   * Workshop lead: Luca Cinquini, Auroop Ganguly, and Raju Vatsavai

Part 1: The Earth System Grid Federation: Building a Global Distributed Infrastructure for Climate Change Research

Presenter: Luca Cinquini

Abstract:

The Earth System Grid (ESG) project is building a global infrastructure to provide the scientific community with unprecedented access to massive, distributed, heterogeneous datasets for climate change research. We will describe the ESG software architecture and demonstrate some of its federation functionality such as single sign-on across gateways, metadata exchange, and distributed data access. We will present a status update on the most current ESG activities, such as: the coordination and preparation for the CMIP5/IPCC-AR5 archive; the expansion of the ESG federation to other federal agencies such as NASA and NOAA and international partners such as BADC and DKRZ; and the inclusion of observational datasets alongside the CMIP5 model output. Finally, we will outline some future lines of development within ESG, such as the movement towards an open source model, a shift in the architectural paradigm, and the integration with middleware for scientific analysis such as the Climate Data Exchange (CDX) services and toolkit.


Part 2: A "new kind" of knowledge discovery: Case study on climate change

Presenter: Auroop Ganguly

Abstract:

A new kind of knowledge discovery is presented which integrates interdisciplinary computational data sciences, process models and decision sciences to provide actionable predictive insights on urgent societal priorities in multidisciplinary settings. The focus is on extremes, nonlinear processes, rare events and change, with a translation to uncertainty and risk assessments. The need for innovative mathematical and computational solutions is motivated along with the ability to discover insights from multisource and heterogeneous data. While there is a need to work with massive data volumes, the data generation processes can be noisy and nonlinear, and the data relevant for analysis may be short compared to the length required for analysis. A case study is presented for knowledge discovery in climate, with a particular emphasis on how data-guided insights can complement physics-based computational models to generate credible predictive insights about climate extremes and regional change along with an assessment of the reliability of the climate projections.


Part 3: Large Scale Remote Sensing Data Mining

Presenter: Raju Vatsavai

Abstract:

Biomass monitoring over large geographic regions using remote sensing poses several challenges. Existing change detection techniques are not adequate or scalable for continuous monitoring. On the other hand, characterizing changes requires accurate classification of remote sensing images. Supervised classification over large geographic regions poses the following challenges: i) inadequate ground truth data, ii) aggregate challenges, iii) spatial homogeneity, and iv) spatial heterogeneity. In this presentation, we discuss recent advances in spatiotemporal data mining, especially the techniques that exploit the subtle multidimensional signals through the joint use of high temporal resolution (MODIS) data and moderate- and fine-spatial resolution (AWiFS) satellite images for extracting multi-temporal biomass change information, including crop types and their conditions. Specifically, we discuss Gaussian Process (GP) based classification and change detection techniques, semi-supervised and sub-class classification techniques and their spatial extensions. In addition, we will discuss computational challenges in scaling these algorithms for large geographic regions and provide recent results on multi-core and many-core architectures.

Drupal in a Day

   * Workshop lead: Sunil Movva, Jerry Pan and Giri Palanisamy
  

Abstract

The "Drupal in a Day – Part 1" workshop will cover the basics of the Drupal content management system. It will cover the terminology and fundamental concepts of Drupal. You will learn how to set up and manage a Drupal-based website, and you will do it on your laptop. Topics include:

1) What Drupal is and why you should care

2) Setting up a Drupal website

3) Building a Drupal website using the Drupal administration user interface

4) Administering a Drupal site

5) Drupal Theme Concepts

The “Drupal in a Day – Part 2” workshop will walk you through extending Drupal’s core functionality with contributed modules. You will learn how to create custom content types with CCK and later will be introduced to the Drupal architecture and APIs essential for custom module development. Topics include:

1) Contributed modules

2) Creating custom content types

3) A brief introduction to module development


Intended Audience: This workshop is geared toward all audiences, including data managers, scientists, and developers. No prior Drupal knowledge is required. Programming for Drupal is covered as an advanced topic; the rest of the workshop does not involve programming.

Workshop Category: Working meeting: overview and tutorials with hands on participation.

Service & Data Casting

   * Workshop lead:  Brian Wilson
   * Short presentations/demos and working discussion

Abstract:

Discussion of how to advertise web services, datasets, and data granules using service, dataset, and granule casts (RSS or Atom feeds). Brian will present on service casting, Ruth Duerr will demo some dataset casting, and Andy Bingham will discuss JPL's Datacasting efforts. Multiple groups are doing this kind of casting, so we will discuss intended uses, how to agree on these formats (use of georss and other metadata tags), and coordinate efforts to minimize redundant work.
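
For readers unfamiliar with the idea, the following is a minimal sketch of what a single "cast" entry might look like when built as an Atom entry carrying a GeoRSS bounding box. The Atom and GeoRSS namespaces are the standard ones; the function make_cast_entry, the example title, and the example URL are hypothetical, and the actual tags used by the service, dataset, and granule casting groups may differ.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"

# Serialize Atom as the default namespace and GeoRSS with a readable prefix.
ET.register_namespace("", ATOM)
ET.register_namespace("georss", GEORSS)


def make_cast_entry(title, link, updated, box):
    """Build one Atom <entry> advertising a dataset or granule.

    box is (south, west, north, east) in decimal degrees, matching the
    "lower corner, upper corner" latitude/longitude order of georss:box.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}id").text = link
    ET.SubElement(entry, f"{{{ATOM}}}link", href=link)
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    ET.SubElement(entry, f"{{{GEORSS}}}box").text = " ".join(str(v) for v in box)
    return entry


if __name__ == "__main__":
    # Hypothetical granule, for illustration only.
    entry = make_cast_entry(
        "Example granule",
        "http://example.org/granule/1",
        "2010-07-21T06:40:00Z",
        (-10.0, 20.0, 10.0, 40.0),
    )
    print(ET.tostring(entry, encoding="unicode"))
```

Since the result is plain Atom, existing feed readers and aggregators can consume it, while clients that understand the geospatial tag can additionally filter entries by location.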

Intended audience:

Anyone interested in publishing lightweight, searchable metadata for services and datasets in these 'casting' formats.

Cloud Computing

Workshop Lead: Dr. Chaowei Phil Yang

Cloud computing is a new computing paradigm that treats computing as a utility, exemplified by Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Federal agencies are being asked to report how and why they are using the cloud in the coming years. This workshop will introduce the concept of cloud computing and use an Earth science application as an example to illustrate how cloud computing can be leveraged for Earth sciences.

OODT

   * Workshop lead:  Chris Mattmann
   * URL:  http://oodt.jpl.nasa.gov

Abstract:

The Object Oriented Data Technology (OODT) middleware is an independent set of reusable software components that can be deployed, tailored and configured to build out scalable data management systems. Complete with data curation/ingestion, processing and dissemination functionality, along with the ability to be deployed onto heterogeneous hardware environments including grids and clouds, OODT has been heavily deployed over the last 10 years at a number of government agencies (NASA, NIH, DoD), universities, and other institutions across the world.

OODT is the flagship technology that powers the NASA Planetary Data System (PDS), the NCI Early Detection Research Network (EDRN) project, and data processing systems for current NASA Earth science missions including OCO/ACOS, NPP Sounder PEATE, and the Soil Moisture Active Passive (SMAP) mission. In addition, OODT is being used in Earth science research projects, including the ACCESS07 Virtual Oceanographic Data Center (VODC) project, the ACCESS09 funded Coastal Marine Discovery System (CDMS) project, and the 2009-funded JPL-led Climate Data eXchange (CDX) project, an effort to share NASA observational data with the Earth System Grid.

I’ll cover the core components of OODT, its architecture, implementation, and recent movement to the Apache Software Foundation (ASF). OODT is the first NASA project to be hosted at Apache. I’ll discuss our efforts to pave the way in making NASA a vibrant member of the open source community.

Metadata Standards for GEOSS

   * Workshop lead:  Phil Yang

Abstract:

Metadata is critical in archiving, searching, finding, locating, and utilizing geospatial resources for scientific research. We will introduce metadata standards and their value in building the GEOSS clearinghouse. The clearinghouse utilizes FGDC, ISO, and other community metadata standards and encodes them in various XML schemas for interoperability among registries and catalogs, including the GEOSS “Component and Service Registry”, Geospatial One Stop, INSPIRE, Geoconnect and other catalogs. The workshop will also introduce problems encountered in metadata use, such as uniqueness and the lack of important items needed for better metadata utilization. This research, development, and Technical Workshop is sponsored by the FGDC CAP program and the NASA spatial interoperability program.