Earth Science Data Analytics/2014-03-20 Telecon

From Earth Science Information Partners (ESIP)

ESDA Telecon notes – 3/20/14

Known Attendees:

  1. ESIP Host (Carol or Erin)
  2. Bamshad Mobasher
  3. Steve Kempler
  4. Seung Hee Kim
  5. John Schnase
  6. Joan Aron
  7. Helen Conover
  8. Robert Downs
  9. Ari Posner
  10. Emily Law
  11. fritz vanwijngaarden
  12. Chung-Lin Shie
  13. Jennifer Davis
  14. Rama
  15. Bruce Caron
  16. Brand Niemann
  17. Anjanette Hawk
  18. Rudy Husar
  19. Thomas Huang
  20. Deborah Smith
  21. Smiley
  22. John Farley
  23. Sara Graves
  24. Beth Huffer

Agenda:

1 Topics to better understand, so far:

- Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community: Sorting out Data Science and Data Analytics

2 Two Guest Speakers – What are people doing with Data Analytics

- Dr. John Schnase, NASA/GSFC: Hands on Experience: Big Data Challenges

- Prof. Bamshad Mobasher, Professor of Data Analytics, DePaul University: Data Analytics Masters Degree Overview

3 ESDA Activities Discussion: These are solid activities that have been suggested so far:

- Compile use cases (include producer/supplier and data user analytics utilization)

- Compile analytics tools (internal and external to ESIP)

- Do gap analysis


Referenced Material:


Presentations:


Notes:

More than 40 people attended this telecon. Interest is high. As in any start-up group addressing an area with extensive components that can be addressed in various ways, we too will coalesce in one or maybe more directions.

The purpose of this telecon was to initiate discussion on Earth Science Data Analytics and the Data Scientist, starting the coalescing process that should result in ESIP contributions that ultimately facilitate the advancement of Earth science.

The following notes show that process commencing, along with several potentially actionable ideas that have come forth so far. Please feel free to add comments to the meeting notes or send me an e-mail.

External Activities:

  • We should look at inventory activities pursued outside ESIP (Emily L)
  • John Schnase (GSFC) has relevant activities related to ‘Climate Analytics-as-a-Service’ (Chris L)
  • We should also look into inviting individuals from other groups (e.g., CODATA, NSF, IEEE) (Bob C, who will help look for/provide points of contact)

Information Sharing:

Ideas (potential direction) and Other Notes:

  • Idea: What does analytics mean in Earth science? Currently, tools are crude. How can we help users find what they are looking for? (Chris L)
  • Idea: Can we define the analytics toolset (focusing on Earth science)? (Sara G)
  • Idea: We can assemble end-to-end team(s) that together address various aspects of data analytics (and, more broadly, Data Science). This would also surface gaps in our expertise. (Bob C)
  • Note: Data Science is much bigger than analytics (Sara, others). Thus, let’s not treat them the same. (We can address both topics, but not as one topic)

RDA Highlights (thanks to Rahul)

  • Idea: We can provide ESIP Earth science expertise to support RDA activities (e.g., use cases) (Sara G, Nancy H)
  • Idea: We can identify cross domain commonalities (Emily L)

NIST highlights (thanks to Wo) – See presentation

  • Idea: We can better understand and provide potential ESIP expertise to NIST activities

Post-Telecon Comments:

  • Idea: Data Supplier vs. Data User perspectives. We can surface/organize the analytics needs and use cases from both perspectives (as noted below, and related to Bob’s idea above)

Comment 1 (from Rudy H):

  • Another dimension of delineating Data Scientist and Data Analytics is along the Data Creator/Provider <---> Data End User axis. The perspectives and needs of Data Science and Data Analytics differ greatly depending on where you are along that axis. Typically, a real gap exists between the two perspectives.

Comment 2 (from Joan A):

  • My main comment is that the telecon tended to focus more on the suppliers of tools. This should be complemented by attention to the demand side. I am thinking of environmental monitoring and protection decision-makers who need interaction with the suppliers of the technologies. ESIP has a niche in contributing to this understanding. Bob Chen's comments about examining the whole process and comments about use cases fit in here. As a user, I have a particular interest in how data analytics and sharing can support better decisions linking environmental protection and public health.
  • Idea: We can consider focusing on the collection of case studies where organizations have implemented big data solutions to problems, carried out analytics, quality assurance, and have allowed policy makers to make informed decisions based on the end products of data science. From this body of work, which can highlight both successes and failures, I think that the group can begin to form recommendations on how organizations should proceed in data science based on their particular goals. It can also serve as a bed of research for data scientists and IT staff to consider alternatives to their own approaches. (Rob C)


Next Telecon:

  • Targeting: March 20, 3:00 EST
  • Looking for help setting the agenda (contact Steve) drawing from ‘ideas’ provided above – Eric K?, Brand N? (help address Data Scientist related activities), Emily L? Others?
  • Invite 2 guest speakers to discuss their Analytics activities