Difference between revisions of "Earth Science Data Analytics/2016-8-18 Telecon"

From Earth Science Information Partners (ESIP)
 
'''Discussion led to Action Item #1, below.'''
 
 
 
'''To Do List:'''
 
 
Done:
 
 
1.  Finalize ESDA Definition and Goal categories
 
 
2.  Write letter to ESIP Executive Committee proposing that the ESDA Definitions and Goal categories be ESIP approved
 
 
3.  Characterize use cases by Goal categories and other analytics driving considerations
 
 
4.  Derive requirements from #3
 
 
Underway:
 
 
5.  Survey existing data analytics tools/techniques
 
 
6.  Write our paper describing ... all the above
 
 
 
  
 

Revision as of 23:03, September 12, 2016

ESDA Telecon notes – 8/18/16

Known Attendees:

ESIP Hosts (Annie Burgess), Steve Kempler, Emily Northup, Beth Huffer, Byron Peters, Lindsay Barbieri, Chung-Lin Shie, Shea Caspersen, Annie Burgess, Tiffany Mathews, Angela Li, Robert Downs, Joan Aron, Tripp Corbett


Agenda:

1. Next steps for ESDA Cluster

2. Open Mic – What else should we be addressing?


Presentations:

None


Notes:

Thank you all for attending and participating in our telecon.


On the topic of where the ESDA Cluster can go from here, the following ideas were floated, consistent with our goals:

- Validate our work

- Prototype

- Have a sandbox for using the tools

- List DA projects that people are working on

- List potential problems that DA could be applied to

- List challenges to doing more DA

- List potential solutions to the challenges

- Move from tabular data to data visualization


We can also become more involved with interagency work:

- Intro/background related to Earth system science

- Current EPA/ORD/Earth science related work: Infrastructure, Analytics

- Interagency work/potential collaborations: EPA/CDC ecological niche modeling of vector-borne diseases (potential project); NASA/NOAA/USGS/EPA CyAN project; EPA/Chesbay Conservancy automated visual recognition projects

- Tools/techniques discussion/update


Other moving forward thoughts:

- What tools/techniques can we address first? What problem should we solve?

- Invite and drill into projects already utilizing data analytics (e.g., EPA projects)

1. Tiffany: Focus on what individuals are working on that uses particular ESDA techniques. Should be use case driven. Bring light to a tool/technique via a use case. Would need to choose candidates.

2. Prototype: Utilize an ESDA tool to solve a data problem. Could have breakout groups to look at different tools/techniques.

2A. Joan: Look at what other ESIP Clusters are doing that can be helpful. For example, talk to Dave Jones, Disaster Cluster, about the data analytics he is utilizing.

     Shea: Brought up Chris Mattmann’s SciSpark talk.
     Can we be the conduit between ESIP ESDA tool interests (e.g., SciSpark) and heterogeneous data issues (e.g., Disaster Data)? This may be a good prototype that might experimentally benefit both groups, and be exemplary for further such prototyping.

Also, let’s take the opportunity to entice others interested in ESDA technologies.

Also, let’s provide an activity slide for Annie for SciDataCon (Steve can do this).


Next Meeting:

September 15, 2016


Agenda:

1. Next Steps

2. Open Mic – What else should we be addressing?


Actions:

1. All: From the Techniques/Goals/Types matrix (https://docs.google.com/spreadsheets/d/1Xg4zYqAqrfu6NMdQtTYJn50J8kXpjpQBC0exX3XYOVQ/edit#gid=0)...

Go down the ESDA Type columns entitled Data Preparation, Data Reduction, and Data Analysis, and delete the techniques (starting on row 16) that can NOT be applied to that column's ESDA type. The techniques that remain are thus those that CAN be applied to that column's ESDA type. If you are not sure what a technique is, please see the technique descriptions in the Google Spreadsheet: https://docs.google.com/spreadsheets/d/1Xv8qySG4k6p8Y3rOYLonWlahwKPT86MCuJoIT3wske8/edit#gid=0 starting on line 33.

Please ask for permission to edit. So that we don't edit over each other, please download a copy to edit, and send back to Steve, or ask Steve to send you a copy to edit.

Please spend a little time on this and send your edits back by Friday, June 24. That would be great.

Step 2 would be mapping techniques to goals. More to come on this.


2. All: For our Cluster face to face in Durham: Please identify one or two multi-data researchers who would be willing to provide insights into their experiences and needs for accessing and preparing data for co-analysis of heterogeneous data.