Difference between revisions of "Earth Science Data Analytics/2015-11-12 Telecon"

From Earth Science Information Partners (ESIP)

Latest revision as of 16:12, November 13, 2015

ESDA Telecon notes – 11/12/15

Known Attendees:

ESIP Host (Annie Burgess), Steve Kempler, Tiffany Mathews, Sean Barberie, Beth Huffer, Chung-Lin Shie, Robert Downs, Ethan McMahon, Joan Aron


Agenda:

1. Finalize ESDA Analytics Definitions and Goals Statement to ESIP ExComm

2. Determining Analytics Tools/Techniques Requirements associated with Analytics Goals

3. Process for associating Analytics Tools/Techniques that can fulfill Requirements

4. Open Mic


Presentations:

None, this time.

Use Case Information: https://docs.google.com/document/d/1U1mAt4ZjJqXeNmtRoE4VbI1nBgS1v7DzeHib_7mzOF8/edit


Notes:

Thank you all for attending and participating in our telecon.


Well, we covered Agenda Item #1 pretty well. The hour consisted of an excellent discussion of the letter being prepared for the ExComm, recommending that ESIP endorse the ESDA definition of Earth science data analytics. The group feels that a clear definition will facilitate the development of ESDA techniques and tools that focus on Earth science.

Today's discussion focused on bringing the definition to its final form and editing the letter. The current version of the letter will soon be posted and further discussed (finalized?) at the next ESDA telecon. The ESDA definition is as follows:


Earth Science Data Analytics definition:

The process of examining, preparing, reducing, and analyzing large amounts of spatial (multi-dimensional), temporal, or spectral data using a variety of data types to uncover patterns, correlations and other information, to better understand our Earth.


The remainder of the time was spent reviewing our to-do list (see below), our Winter meeting session (see the abstract that follows), and the following potential collaborations with other ESIP working groups (clusters, etc.):


Emerging Big Data Technologies for Geoscience - We can share derived ESDA requirements and found technology gaps

Esip-disasters and Esip-infoquality - We can share use cases to determine what data analytics requirements may emerge


Winter ESIP ESDA Cluster Session Abstract:

The Earth Science Data Analytics (ESDA) Cluster has made great strides in understanding the utilization of data analytics in Earth science, an area virtually untouched in the literature. In achieving its goal to support advancing science research that increasingly includes very large volumes of heterogeneous data, the ESDA Cluster has defined terms, documented use cases, and loosely identified tools and technologies that facilitate a better understanding of the needs of Earth science research.


This cluster session will discuss and initiate the work still to be done, including evaluating use cases, extracting data analytics requirements from use cases (this will be a major part of the discussion), surveying existing data analytics tools and techniques, and sharing derived ESDA requirements and found technology gaps with the ESIP group interested in 'Emerging Big Data Technologies for Geoscience'.


To Do List:

Done:

1. Finalize ESDA Definition and Goal categories

Underway:

2. Write letter to ESIP Executive Committee proposing that the ESDA Definitions and Goal categories be ESIP approved

3. Acquire many more use cases

4. Characterize use cases by Goal categories and other analytics driving considerations

5. Derive requirements from #4

6. Survey existing data analytics tools/techniques

7. Write our paper describing ... all the above


Questions to think about:

What is the best way to record use cases, their associated requirements, and matching tools? A forum?



Going to AGU?

The following Data Analytics / Big Data related sessions are scheduled for the AGU meeting in December:

  • Advanced Information Systems to Support Climate Projection Data Analysis

Gerald L Potter, Tsengdar J Lee, Dean Norman Williams, and Chris A Mattmann

  • Big Data Analytics for Scientific Data

Emily Law, Michael M Little, Daniel J Crichton, and Padma A Yanamandra-Fisher

  • Big Data in Earth Science – From Hype to Reality

Kwo-Sen Kuo, Rahul Ramachandran, Ben James Kingston Evans, and Mike M Little

  • Big Data in the Geosciences: New Analytics Methods and Parallel Algorithms

Jitendra Kumar and Forrest M Hoffman

  • Computing Big Earth Data

Michael M Little, Darren L. Smith, Piyush Mehrotra, and Daniel Duffy

  • Geophysical Science Data Analytics Use Case Scenarios

Steven J Kempler, Robert R Downs, Tiffany Joi Mathews, and John S Hughes

  • Man vs. Machine - Machine Learning and Cognitive Computing in the Earth Sciences

Jens F Klump, Xiaogang Ma, Jess Robertson and Peter A Fox

  • New approaches for designing Big Data databases

David W Gallaher and Glenn Grant

  • Partnerships and Big Data Facilities in a Big Data World

Kenneth S Casey and Danie Kinkade

  • Towards a Career in Data Science: Pathways and Perspectives

Karen I Stocks, Lesley A Wyborn, Ruth Duerr, and Lynn Yarmey


Next Telecon:

Thursday, December 3, 2015, 3:00 EST


Agenda:

Among other things, finalize the letter to the ESIP Executive Committee for ESIP ESDA definition approval; discuss the process for matching use case requirements with capabilities of existing tools.

Actions:

Steve: Update ESDA definition endorsement letter

Volunteers: Review endorsement paper, when ready

All: Think about process for matching use case requirements with capabilities of existing tools.