Difference between revisions of "Earth Science Data Analytics/2014-05-22 Telecon"

From Earth Science Information Partners (ESIP)
 

Latest revision as of 09:42, December 4, 2015

ESDA Telecon notes – 5/22/14

Known Attendees:

ESIP Host (Erin), Tiffany Mathews, Robert Casey, Steve Kempler, Chung-Lin Shie, Sara Graves, Joan Aron, Emily Law, Beth Huffer, Bob Chen, H. Joe Lee, Brand Niemann, Robert Downs, Suhung Shen

Agenda:

1 – Steve Kempler - Recap of last telecon


2 – Tiffany Mathews – Describe/Demonstrate UV-CDAT and ClimatePipes visualization analytics tools

UV-CDAT: http://uv-cdat.llnl.gov/


3 – Tiffany - To lead discussion started last week: 'enabling users to leverage data to observe more phenomena than what can be identified when studying an average'.

Tiffany will continue the discussion with her presentation entitled: "Atmospheric Science Data Center Sample Data Analytics Use Cases."


4 – Steve - Present new Cluster Information Sharing Websites

Earth Science Data Analytics Discussion Forum - http://wiki.esipfed.org/index.php/Earth_Science_Data_Analytics/Discussion_Forum

Use Case Collection webpage - http://wiki.esipfed.org/index.php/Use_Case_Collection

Data Analytics Tools/Techniques Collection webpage - http://wiki.esipfed.org/index.php/Analytics_Tools


Presentations:


Notes:

Today, Tiffany provided a demonstration of UV-CDAT and described ClimatePipes, two visualization analytics tools. Tiffany followed with a continuation of last month's discussion on different types of data analytics. Steve followed, showing the ESDA Use Case Gathering website and the Data Analytics Tools/Techniques Inventory website.


From Tiffany's UV-CDAT and ClimatePipes demonstration/discussion

UV-CDAT (http://uv-cdat.llnl.gov/) Description: UV-CDAT brings together two active projects -- Ultrascale Visualization Climate Data Analysis Tools and Visual Data Exploration and Analysis of Ultra-large Climate Data, with the intent to deliver new capabilities to the climate-science community. This project’s vision is to provide large-scale visualization and analysis for both observational and model-generated climate data, with the goal of delivering new capabilities into the hands of the climate scientists. The integrated software product, the Ultrascale Visualization Climate Data Analysis Tools (UV-CDAT), is intended to be a powerful and complete front-end to a rich set of visual-data exploration and analysis capabilities well suited for climate-data analysis problems. UV-CDAT builds on the following key technologies: the Climate Data Analysis Tools (CDAT) framework; ParaView; VisTrails; and VisIt.

Additional Info: The NCCS at GSFC is developing climate data analysis and visualization tools for UV-CDAT that provide data analysis capabilities for the Earth System Grid (ESG). These tools feature workflow interfaces, interactive 3D data exploration, hyperwall and stereo visualization, automated provenance generation, parallel task execution, and streaming data parallel pipelines. NASA’s DV3D is a UV-CDAT package that enables exploratory analysis of diverse and rich data sets from various sources, including the Earth System Grid Federation (ESGF). Additionally, Python scripts can easily be generated.

ClimatePipes Description: ClimatePipes is a web-based application platform/"IDE" for science data analysis. It can be used to create and run analysis workflows and visualizations.

Additional Info: The front-end uses HTML5, WebGL, and CSS3 for geospatial visualizations. The back-end is built using the Visualization Toolkit (VTK), Climate Data Analysis Tools (CDAT), and other climate and geospatial data processing tools such as GDAL and PROJ4. ParaView Web, D3, and Canvas are also used for some visualizations. ClimatePipes offers look-up tools, works with UV-CDAT and MongoDB, can read NetCDF, offers a Python web service infrastructure, and supports workflows and provenance tools using VisTrails. Python was chosen as the server-side language, using CherryPy (http://www.cherrypy.org/) as the web server. jQuery (http://jquery.com/) and Bootstrap are used as the supporting frameworks for a consistent, interactive cross-browser experience.


Tiffany next continued the discussion to answer the following questions:

1. What are your most time-consuming data tasks that can leverage analytics?
2. Identify and discuss different types of analytics
3. What kind of data analytics is needed for specific use cases?
4. Identify tools and technologies that address different types of analytics


Discussion focused on the different types of data analytics:
[Image: Onemoretype.png]


In particular, question 3, regarding use cases, and question 4, regarding tools and technologies, led to a 'tour', by Steve, through the ESDA information gathering pages. Namely:

Use Case Collection webpage - http://wiki.esipfed.org/index.php/Use_Case_Collection

Data Analytics Tools/Techniques Collection webpage - http://wiki.esipfed.org/index.php/Analytics_Tools


In the spirit of compiling use cases, analytics tools/techniques, and performing gap analysis between use case analytics needs and available tools/techniques, telecon participants volunteered to provide data analytics use cases. This is as simple as providing the following information:

Use Case Name:
Provided By:
Brief Description:
Key Analytics Needs:

All ESDA members are encouraged to provide use cases they have come across or are faced with. Thanks to Beth, Robert, and Suhung.
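The four-field use case template above could be kept as a small structured record, so that entries stay uniform before they are pasted into the wiki. The following is a minimal sketch, not part of the ESDA pages: the class name, field names, and the example entry are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UseCase:
    """One use case entry; field names mirror the four-line template (illustrative)."""
    name: str
    provided_by: str
    brief_description: str
    key_analytics_needs: List[str] = field(default_factory=list)

    def to_template(self) -> str:
        """Render the entry in the four-line form used in these notes."""
        needs = "; ".join(self.key_analytics_needs)
        return "\n".join([
            f"Use Case Name: {self.name}",
            f"Provided By: {self.provided_by}",
            f"Brief Description: {self.brief_description}",
            f"Key Analytics Needs: {needs}",
        ])


# Hypothetical example entry, loosely based on a use case raised in the discussion.
example = UseCase(
    name="Multi-variable correlation",
    provided_by="(contributor name)",
    brief_description="Look for correlations across multiple variables.",
    key_analytics_needs=["data integration", "correlation analysis"],
)
print(example.to_template())
```

Keeping the analytics needs as a list rather than free text would make a later gap analysis against the tools inventory easier to automate.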


The 'tour' continued with a walk through the Data Analytics Tools/Techniques Collection webpage. When additional analytics tools/techniques were solicited, Tiffany offered a list that she has been compiling. Others are encouraged to share as well.

You can edit the websites or, if easier, feel free to send use cases and tools to: Steven.J.Kempler@nasa.gov.


Additional Discussion:
- Preparation for using specific analytics tools may be difficult, or not possible, if the tool cannot support specific data characteristics. Beth discussed the ToolMatch project of the ESIP Semantic Web Cluster as a way for us to track tools for data analytics:

ToolMatch Service (http://wiki.esipfed.org/index.php/ToolMatch): Finding Tools for Your Data & Data for Your Tools. ToolMatch is intended to be a service based on community-built semantic web applications that will provide data users with the means to match their datasets with a comprehensive list of useful, appropriate tools, and also provide data tool developers with datasets or data collections that will work with their tools.

Next ToolMatch telecon: Tuesday May 27 at 4pm Eastern time
Call-in toll-free number (US/Canada): 1-877-668-4493, code: 231 033 48
WebEx: https://esipfed.webex.com/esipfed/j.php?MTID=m98ad38879252b9000f6a489a8b2fad48. If a password is required, enter the Meeting Password: 23103348


- Exemplary Use Cases: Looking for correlations across multiple variables; Bringing multiple datasets together utilizing Giovanni

- Add a use case column to the tools inventory matrix to indicate who would find each tool useful (e.g., data producer, data provider, user)
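The two ideas in this discussion, matching tools to data characteristics and recording who would find each tool useful, can be illustrated together with a small inventory table. This is only a plain-Python sketch under stated assumptions: it is not how ToolMatch works (ToolMatch is described as semantic-web based), and the `ToolEntry` fields, the `match_tools` helper, and the placeholder rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ToolEntry:
    """One row of a tools inventory; all field names are illustrative."""
    name: str
    formats: FrozenSet[str]     # data formats the tool can read
    useful_for: FrozenSet[str]  # proposed column: "producer", "provider", "user"


def match_tools(inventory: List[ToolEntry], data_format: str, role: str) -> List[str]:
    """Return names of tools that read `data_format` and are marked useful for `role`."""
    return [
        tool.name
        for tool in inventory
        if data_format in tool.formats and role in tool.useful_for
    ]


# Placeholder rows, not real inventory entries.
inventory = [
    ToolEntry("Tool A", frozenset({"NetCDF", "HDF5"}), frozenset({"user", "provider"})),
    ToolEntry("Tool B", frozenset({"GeoTIFF"}), frozenset({"user"})),
    ToolEntry("Tool C", frozenset({"NetCDF"}), frozenset({"producer"})),
]

print(match_tools(inventory, "NetCDF", "user"))
```

With the extra column in place, the same inventory can answer both "which tools can read my data?" and "which tools are aimed at someone in my role?".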


Next Telecon:

  • June 26, 3:00 EST
  • Agenda (as of now)

- Listen and Learn - We will have 2 guest speakers to discuss their Analytics activities

- ESDA Activities - Use Case Collection webpage - http://wiki.esipfed.org/index.php/Use_Case_Collection

- Preparation for Frisco