Difference between revisions of "Earth Science Data Analytics/2016-8-18 Telecon"

From Earth Science Information Partners (ESIP)
Latest revision as of 09:39, October 18, 2016

ESDA Telecon notes – 8/18/16

Known Attendees:

ESIP Hosts (Annie Burgess), Steve Kempler, Emily Northup, Beth Huffer, Byron Peters, Lindsay Barbieri, Chung-Lin Shie, Shea Caspersen, Tiffany Mathews, Angela Li, Robert Downs, Joan Aron, Tripp Corbett


Agenda:

1. Next steps for ESDA Cluster

2. Open Mic – What else should we be addressing?


Presentations:

None


Notes:

Thank you all for attending and participating in our telecon.


On the topic of where the ESDA Cluster can go from here, the following ideas were floated, consistent with our goals:

- Validate our work

- Prototype

- Have a sandbox for using the tools

- List DA projects that people are working on

- List potential problems that DA could be applied to

- List challenges to doing more DA

- List potential solutions to the challenges

- Move from tabular data to data visualization


We can also become more involved with interagency work:

- Intro/background related to Earth system science

- Current EPA/ORD/Earth science related work - Infrastructure, Analytics

- Inter-agency work/potential collaborations: EPA/CDC ecological niche modeling of vector-borne diseases potential project; NASA/NOAA/USGS/EPA CyAN project; EPA/Chesbay Conservancy automated visual recognition projects

- Tools/techniques discussion/update


Other thoughts on moving forward:

- What tools/techniques can we address first? What problem should we solve?

- Invite and drill into projects already utilizing data analytics (e.g., EPA projects)


1. Tiffany: Focus on what individuals are working on that uses particular ESDA techniques. Should be use case driven. Bring light to a tool/technique via a use case. Would need to choose candidates.

2. Prototype: Utilize an ESDA tool to solve a data problem. Could have breakout groups to look at different tools/techniques.

2A. Joan: Look at what other ESIP Clusters are doing that can be helpful. For example, talk to Dave Jones, Disaster Cluster, about the data analytics he is utilizing.

3. Shea: Brought up Chris Mattmann’s SciSpark talk.

Can we be the conduit between ESIP ESDA tool interests (e.g., SciSpark) and heterogeneous data issues (e.g., Disaster Data)? This may be a good prototype that might experimentally benefit both groups, and be exemplary for further such prototyping.

Also, let’s take the opportunity to entice others interested in ESDA technologies.

Also, let’s provide an activity slide for Annie for SciDataCon (Steve can do this).


Next Meeting:

September 15, 2016


Agenda:

1. Next Steps

2. Open Mic – What else should we be addressing?