Earth Science Data Analytics/2015-12-03 Telecon
ESDA Telecon notes – 12/03/15
Known Attendees:
ESIP Host (Erin Robinson), Steve Kempler, Sean Barberie, Chung-Lin Shie, Robert Downs, Ethan McMahon, Joan Aron, Thomas Hearty
Agenda:
1. ESDA Analytics Definitions and Goals Statement – ready to go
2. Starting the gap analysis between ESDA requirements and available tools/techniques
3. Preparing for ESIP Meeting in January
4. See you at AGU (??)
Presentations:
None, this time.
Use Case Information: https://docs.google.com/document/d/1U1mAt4ZjJqXeNmtRoE4VbI1nBgS1v7DzeHib_7mzOF8/edit
Notes:
Thank you all for attending and participating in our telecon. Sorry about the video problems.
Excellent discussion… and thanks for the very relevant new information brought forth by Ethan.
ESDA definitions and goals are ready to be presented to the ESIP ExComm. Erin explained that once we send our 'letter' to them, they will review it, put it out to the Assembly for comment for 30 days, and then put it to a vote. Pretty exciting! No other organization has endorsed an Earth Science Data Analytics definition, although a few struggle to derive one.
Briefly discussing agenda item 3: we will have one ESIP Meeting cluster session that will contain, at this time, (1) a presentation by Steve Ambrose (NASA NCCS) on new tools developed by NCCS for utilizing cloud-based data (we touched upon whether ESDA is the most appropriate audience for this presentation; Erin thought the Cloud session might be a better fit), and (2) a discussion of the Gap Analysis between ESDA requirements and available tools/techniques.
The remainder of the telecon focused on preparing for the Gap Analysis discussion (agenda item 2).
Well, we covered Agenda Item #1 pretty well. The hour consisted of an excellent discussion on the letter being prepared for the ExComm recommending that ESIP endorse the ESDA Earth science data analytics definition. It is felt that having a clear ESDA definition will facilitate the development of ESDA techniques and tools that focus on Earth science.
Today's discussion focused on improving the definition to its final form, and editing the letter. The current version of the letter will soon be posted and further discussed (finalized?) at the next ESDA telecon. The ESDA definition is as follows:
Earth Science Data Analytics definition:
The process of examining, preparing, reducing, and analyzing large amounts of spatial (multi-dimensional), temporal, or spectral data using a variety of data types to uncover patterns, correlations and other information, to better understand our Earth.
The remainder of the time was spent reviewing our to do list (see below), our Winter meeting session (see abstract that follows), and the following potential collaborations with other ESIP working groups (clusters, etc.):
Emerging Big Data Technologies for Geoscience - We can share derived ESDA requirements and found technology gaps
Esip-disasters and Esip-infoquality - We can share use cases to determine what data analytics requirements may emerge
Winter ESIP ESDA Cluster Session Abstract:
The Earth Science Data Analytics (ESDA) Cluster has made great strides in understanding the utilization of data analytics in Earth science, an area virtually untouched in the literature. In achieving its goal to support advancing science research that increasingly includes very large volumes of heterogeneous data, the ESDA Cluster has defined terms, documented use cases, and loosely identified tools and technologies that facilitate a better understanding of the needs of Earth science research.
This cluster session will discuss and initiate the work still to be done, including evaluating use cases, extracting data analytics requirements from use cases (this will be a major part of the discussion), surveying existing data analytics tools and techniques, and sharing derived ESDA requirements and found technology gaps with the ESIP group interested in 'Emerging Big Data Technologies for Geoscience'.
To Do List:
Done:
1. Finalize ESDA Definition and Goal categories
2. Write letter to ESIP Executive Committee proposing that the ESDA Definitions and Goal categories be ESIP approved
3. Characterize use cases by Goal categories and other analytics driving considerations
4. Derive requirements from #3
Underway:
5. Further validate requirements with (many) more use cases
6. Survey existing data analytics tools/techniques
7. Write our paper describing ... all the above
Questions to think about:
What is the best way to record use cases, and associated requirements, and matching tools? A forum?
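One possible way to think about this question, offered only as a rough sketch: record each use case with its set of requirements and each tool with its set of capabilities, then treat the gap analysis as a set comparison. The class names, function names, and sample entries below are hypothetical placeholders, not items from the ESDA use case document or any agreed format.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """A use case and the analytics requirements derived from it."""
    name: str
    requirements: set = field(default_factory=set)


@dataclass
class Tool:
    """A tool or technique and the capabilities it provides."""
    name: str
    capabilities: set = field(default_factory=set)


def matching_tools(use_case, tools):
    """Map each requirement of a use case to the names of tools that cover it."""
    return {
        req: [t.name for t in tools if req in t.capabilities]
        for req in sorted(use_case.requirements)
    }


def find_gaps(use_cases, tools):
    """Map each requirement that no tool covers to the use cases that need it."""
    covered = set()
    for tool in tools:
        covered |= tool.capabilities
    gaps = {}
    for uc in use_cases:
        for req in sorted(uc.requirements - covered):
            gaps.setdefault(req, []).append(uc.name)
    return gaps


# Hypothetical sample records, for illustration only.
use_cases = [
    UseCase("Flood mapping", {"data fusion", "time-series analysis"}),
    UseCase("Air quality trends", {"time-series analysis", "anomaly detection"}),
]
tools = [
    Tool("Tool A", {"time-series analysis"}),
    Tool("Tool B", {"data fusion"}),
]

print(matching_tools(use_cases[0], tools))
print(find_gaps(use_cases, tools))
```

With records in this shape, the same data could be kept in a shared spreadsheet or a forum post; the only assumption is that requirements and capabilities are described with a common vocabulary so they can be compared.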
Going to AGU?
The following Data Analytics / Big Data related sessions are listed to occur at the AGU in December:
- Advanced Information Systems to Support Climate Projection Data Analysis
Gerald L Potter, Tsengdar J Lee, Dean Norman Williams, and Chris A Mattmann
- Big Data Analytics for Scientific Data
Emily Law, Michael M Little, Daniel J Crichton, and Padma A Yanamandra-Fisher
- Big Data in Earth Science – From Hype to Reality
Kwo-Sen Kuo, Rahul Ramachandran, Ben James Kingston Evans, and Mike M Little
- Big Data in the Geosciences: New Analytics Methods and Parallel Algorithms
Jitendra Kumar and Forrest M Hoffman
- Computing Big Earth Data
Michael M Little, Darren L. Smith, Piyush Mehrotra, and Daniel Duffy
- Geophysical Science Data Analytics Use Case Scenarios
Steven J Kempler, Robert R Downs, Tiffany Joi Mathews, and John S Hughes
- Man vs. Machine - Machine Learning and Cognitive Computing in the Earth Sciences
Jens F Klump, Xiaogang Ma, Jess Robertson and Peter A Fox
- New approaches for designing Big Data databases
David W Gallaher and Glenn Grant
- Partnerships and Big Data Facilities in a Big Data World
Kenneth S Casey and Danie Kinkade
- Towards a Career in Data Science: Pathways and Perspectives
Karen I Stocks, Lesley A Wyborn, Ruth Duerr, and Lynn Yarmey
Next Telecon:
No telecon until late January… but we'll see you in Washington
Agenda:
Discuss Gap Analysis: Matching use case requirements with capabilities of existing tools.
Actions:
1. Steve, Joan, Ethan, Sean, Chung-Lin, Rob, Thomas:
a. Read paper provided by Ethan: http://www.boozallen.com/content/dam/boozallen/media/file/The-Field-Guide-to-Data-Science.pdf
b. Describe the ESDA tools/techniques we identified (More details to follow in e-mail)
c. Map techniques defined in the Booz Allen paper to our ESDA goals and requirements, as appropriate (More details to follow in e-mail)
2. All other ESDA members: Help… please let Steve know if you can help us with Action #1