Earth Science Data Analytics/2015-6-18 Telecon
ESDA Telecon notes – 6/18/15
ESIP Host (Annie Burgess), Steve Kempler, Chung-lin Shie, Tiffany Mathews, Brand Niemann, Joan Aron, Rob Casey, Ward Fleir
1. ESDA Use Cases
2. Summer Meeting Planning – 2 Sessions scheduled:
- Teaching Science Data Analytics Skills, and the Earth Science Data Scientist (http://commons.esipfed.org/node/7999)
- The Need for Earth Science Data Analytics to Facilitate Community Resilience (and other applications) (http://commons.esipfed.org/node/7998)
3. Open Mic
None, this time.
Use Case Information: https://docs.google.com/document/d/1U1mAt4ZjJqXeNmtRoE4VbI1nBgS1v7DzeHib_7mzOF8/edit
Thank you all for attending.
After a short discussion of the Data Analytics types we have been addressing, we concluded, based on attempting to associate use cases with types, that: 1. We may need to rethink whether the data analytics types actually apply to Earth science data analytics; and 2. The types we have been focusing on appear to be at a higher level than what our use cases can associate with. This is an area needing additional study and, with additional use cases, it may become clearer.
We then discussed our 2 ESIP ESDA sessions.
1. Teaching Science Data Analytics Skills, and the Earth Science Data Scientist (http://commons.esipfed.org/node/7999)
We will have 4 speakers, who will share their experiences of being, or needing, a Data Scientist in their work. The goal of this session is to discuss and extract the data scientist/analytics needs of real projects, initiated by the presentations and then discussed by session participants. Of special interest is bringing together people who need data scientists (data analytics) and who will be able to articulate those needs by the end of the session, and/or stirring ideas for the use of data analytics in their research or for building tools/services for others.
Wade Bishop, School of Information Sciences, University of Tennessee
Peter Fox, Earth & Environmental Sciences, Tetherless World Constellation, Rensselaer Polytechnic Institute
Lewis McGibbney, Computer Science for Data Intensive Systems Group, Jet Propulsion Laboratory
Karen Stocks, Director, Geological Data Center, Scripps Institution of Oceanography
2. The Need for Earth Science Data Analytics to Facilitate Community Resilience (and other applications) (http://commons.esipfed.org/node/7998)
This will be an open-forum, idea-sharing discussion session. This cluster session will review our current work (for new participants), followed by discussion of the extent of social, economic, and environmental issues, as well as science research, on which the advancement of Earth science data analytics has had an impact. The goal of this discussion is to gain sufficient information to categorize how Earth science data analytics has come to be used in our society, and identify use cases that exemplify this.
Today, we discussed: What should our summer session goals be, and how do we get there?
The following goals were identified:
1. Identify use cases that address societal issues, specifically
2. Identify datasets (Earth science and otherwise) needed to address the use case issues
3. Identify techniques that may be applied to gleaning information out of the data targeted at addressing the issues
What we should keep in the back of our mind is how we can show the significance of Earth science data analytics in addressing societal issues.
The telecon ended with a reminder from Annie that presenting an ESDA cluster poster at the summer meeting would be good. The poster could describe our activities, speakers, and the things we have learned. We'll put a poster together based on the information we have shared through our website.
After our telecon, Rob Casey shared a very insightful e-mail regarding ESDA's relationship to community resilience. (Thanks, Rob.) With permission, I am providing Rob's insights here, and feel they can stimulate further discussion of this in July (unfortunately, Rob won't be able to make the meeting).
I think one of the first things to consider, which can help answer the 'why' of ESDA, is what does it mean to benefit society? What does society need from science? What is community resilience?
A community is resilient if it can effectively respond to unexpected events.
A community is resilient if it can prepare or engineer for dangerous eventualities.
A community is resilient if it can avert a dangerous event through preemptive action.
The state of affairs with science as applied to benefitting society has been observation of past data to explain phenomena and detection systems that can serve as warning measures to protect a population from a danger in progress.
ESDA can serve to take us beyond this state of affairs at a number of levels:
- Gathering of a larger variety of datasets to apply correlations and identify possible precursors to adverse events -- this can improve early warning systems and enhance predictive algorithms to plot the course or effect of damaging events
- Gathering and computation of a large enough set of data at high resolution to create detailed and reliable simulations of adverse events to establish hazard probabilities -- this informs disaster preparedness planning as well as engineering preparation
- Establishing associations of possible cause and effect occurrences, and applying these to predictive models. Knowing causes of adverse effects can establish a target for mitigation or avoidance. Accurate prediction of future effects can motivate society to act more quickly and effectively to curtail the looming issue.
- Refining the quality of gathered data will improve its usefulness and accuracy for long-tail studies. Data preparation is a key ingredient to useful and meaningful analytics. A proper program of data governance and continual data improvement ensures that data is always available and always useful.
- Good analytics also requires good tools and good visuals. Can the right people see the information they need easily and readily? Is it presented in such a way that it is useful, meaningful, and comprehensible to scientists and decision makers? Is there a base of tool technology that allows for grassroots growth of data analysis with community contribution?
The theme of this session will need to apply what is possible with ESDA technology and techniques and bring it back to where society will benefit. Society needs to evolve from studying a problem to predicting and even preventing a problem to attain maximum resilience. So many of our ills that deal with nature and anthropogenic effects on nature can be readily listed and an agency studying it can be identified.
The following Data Analytics / Big Data related sessions are listed to occur at the AGU next December:
- Advanced Information Systems to Support Climate Projection Data Analysis
Gerald L Potter, Tsengdar J Lee, Dean Norman Williams, and Chris A Mattmann
- Big Data Analytics for Scientific Data
Emily Law, Michael M Little, Daniel J Crichton, and Padma A Yanamandra-Fisher
- Big Data in Earth Science – From Hype to Reality
Kwo-Sen Kuo, Rahul Ramachandran, Ben James Kingston Evans, and Mike M Little
- Big Data in the Geosciences: New Analytics Methods and Parallel Algorithms
Jitendra Kumar and Forrest M Hoffman
- Computing Big Earth Data
Michael M Little, Darren L. Smith, Piyush Mehrotra, and Daniel Duffy
- Geophysical Science Data Analytics Use Case Scenarios
Steven J Kempler, Robert R Downs, Tiffany Joi Mathews, and John S Hughes
- Man vs. Machine - Machine Learning and Cognitive Computing in the Earth Sciences
Jens F Klump, Xiaogang Ma, Jess Robertson and Peter A Fox
- New approaches for designing Big Data databases
David W Gallaher and Glenn Grant
- Partnerships and Big Data Facilities in a Big Data World
Kenneth S Casey and Danie Kinkade
- Towards a Career in Data Science: Pathways and Perspectives
Karen I Stocks, Lesley A Wyborn, Ruth Duerr, and Lynn Yarmey
Tuesday, July 14, 2015, 8:30 PST, at Asilomar, first thing Tuesday morning
Thursday, July 16, 2015, 10:30 PST, at Asilomar
All: Be at Asilomar