Difference between revisions of "Cloud Computing Cluster Plan 2021"

From Earth Science Information Partners (ESIP)
(Added page)
 
Line 1: Line 1:
= '''ESIP Cloud Computing Cluster 2021 Plan''' =
+
='''ESIP Cloud Computing Cluster 2021 Plan'''=
  
* Co-Chair(s): Aimee Barciauskas, Sudhir Shrestha
+
*Co-Chair(s): Aimee Barciauskas, Sudhir Shrestha
* Website (wiki): [[Index.php/|http://wiki.esipfed.org/index.php/]]....
+
*Website (wiki): [[Index.php/|http://wiki.esipfed.org/index.php/]]....
* Monthly Meeting Day/Time: 4th Monday of every month at 1pm ET
+
*Monthly Meeting Day/Time: 4th Monday of every month at 1pm ET
  
== '''The Cloud Computing Cluster 2021 Objective is to create more cloud experts.''' ==
+
=='''The Cloud Computing Cluster 2021 Objective is to create more cloud experts.'''==
 
'''Create more cloud experts.'''
 
  
 
The ESIP Cloud Computing Cluster aims to create more “cloud experts”: Earth data science users who use cloud resources to get science done either faster or at greater scale. The Cloud Computing Cluster will match cloud technologies with Earth data users and applications, specifically in the Earth sciences but also with decision makers and the general public in mind.
  
== '''Things we may do to fulfill our objectives:''' ==
+
=='''Things we will do to fulfill our objectives:'''==
 
One of the challenges for this cluster is its broad, cross-cutting mandate to create more cloud experts. Topics range from science on the cloud to infrastructure as code. Stakeholders range from scientists to decision makers to software developers.
  
 
We can address these challenges through our general and targeted approaches:
 
  
* This cluster is a home to ambiguous cross-over and cross-cutting topics.
+
*This cluster is a home to ambiguous cross-over and cross-cutting topics.
* This cluster acknowledges that there is not one set of best practices or tools that will address all applications and experience levels.
+
*This cluster acknowledges that there is not one set of best practices or tools that will address all applications and experience levels.
* This cluster will meet 1x / month to participate in '''knowledge sharing''' and conversation.
+
*This cluster will meet 1x / month to participate in '''knowledge sharing''' and conversation.
* This cluster will (optionally) meet 1x / month to have a '''working session''' where we work on disseminating and formalizing the outputs of the cluster.
+
*This cluster will (optionally) meet 1x / month to have a '''working session''' where we work on disseminating and formalizing the outputs of the cluster.
* This cluster will have quarterly themes where we will lead and engage in discussions about active topics of interest.
+
*This cluster will have quarterly themes where we will lead and engage in discussions about active topics of interest.
** '''The first quarter will be May 1, 2021 - July 31, 2021 and will focus on''' '''cloud-friendly data formats and tools.'''
+
**'''The first quarter will be May 1, 2021 - July 31, 2021 and will focus on''' '''cloud-friendly data formats and tools.'''
** This cluster will lead 1x / quarter #cloudclinics. These “live” events will focus on the theme of the quarter and include volunteer experts to field questions.
+
**This cluster will lead 1x / quarter #cloudclinics. These “live” events will focus on the theme of the quarter and include volunteer experts to field questions.
*** When will the first cloud clinic be? Perhaps during or right after the ESIP summer session.
+
***When will the first cloud clinic be? Perhaps during or right after the ESIP summer session.
*** Ongoing #cloudclinics Twitter and Slack channels for posting and answering questions
+
***Ongoing #cloudclinics Twitter and Slack channels for posting and answering questions
** Themes for later quarters may be:
+
**Themes for later quarters may be:
*** Infrastructure as code and DevOps best practices
+
***Infrastructure as code and DevOps best practices
*** Metadata standards, e.g. guidance on when to use STAC and when to use CMR
+
***Metadata standards, e.g. guidance on when to use STAC and when to use CMR
*** Open science platforms, such as Pangeo
+
***Open science platforms, such as Pangeo
* This cluster will create documentation on common and current cloud science topics such as:
+
*This cluster will create documentation on common and current cloud science topics such as:
** Which metadata standard is right for my use case? When to use STAC, CMR, OGC
+
**Which metadata standard is right for my use case? When to use STAC, CMR, OGC
** What is the right cloud-optimized data format for my data and my science use case?
+
**What is the right cloud-optimized data format for my data and my science use case?
** How do I find the right publicly available data on the cloud for my use case?
+
**How do I find the right publicly available data on the cloud for my use case?
* ESIP Summer session “The saga continues: new advances in cloud-friendly data formats”
+
*ESIP Summer session “The saga continues: new advances in cloud-friendly data formats”
** MRF+CRF, Zarr, EPT, COG, NetCDF4/HDF5, GRIB2, TileDB, Ohmy
+
**MRF+CRF, Zarr, EPT, COG, NetCDF4/HDF5, GRIB2, TileDB, Ohmy
** Zarr metadata for NetCDF4/HDF5
+
**Zarr metadata for NetCDF4/HDF5
** Pangeo-forge - build your own cloud-friendly dataset
+
**Pangeo-forge - build your own cloud-friendly dataset
** What is the metadata required?
+
**What is the metadata required?
** What should be the durability/ephemeral nature of cloud-native formats?
+
**What should be the durability/ephemeral nature of cloud-native formats?
** How can data formats help solve the multi-cloud multi-region challenge?
+
**How can data formats help solve the multi-cloud multi-region challenge?
** '''The format suited to one need (visualization) may differ from the format suited to another (analysis)'''
+
**'''The format suited to one need (visualization) may differ from the format suited to another (analysis)'''
*** Xarray can be used for multi-dimensional analysis but is challenging to visualize
+
***Xarray can be used for multi-dimensional analysis but is challenging to visualize
*** What are some cool solutions to this challenge?
+
***What are some cool solutions to this challenge?
  
== '''Things our collaboration area needs to deliver our objectives?''' ==
+
=='''Things our collaboration area needs to deliver our objectives?'''==
 
''(e.g. Partnerships, in-kind support, staff support)''
 
  
== '''How will we know we are on the right track?''' ==
+
=='''How will we know we are on the right track?'''==
  
* Engineers and scientists attend and are interested in our webinar series
+
*Engineers and scientists attend and are interested in our webinar series
* Participation in live #cloudclinics
+
*Participation in live #cloudclinics
* Activity on social channels (Twitter and Slack) for #cloudclinics
+
*Activity on social channels (Twitter and Slack) for #cloudclinics
  
== '''How will others know what we are doing in & out of ESIP?''' ==
+
=='''How will others know what we are doing in & out of ESIP?'''==
  
* Post webinars on the ESIP YouTube channel
+
*Post webinars on the ESIP YouTube channel
* Attend other sessions and advertise webinars and #cloudclinics
+
*Attend other sessions and advertise webinars and #cloudclinics
  
== '''Existing or Desired Cross-collaboration area connections''' ==
+
=='''Existing or Desired Cross-collaboration area connections'''==
  
* <nowiki>https://www.openscapes.org/blog/2021/03/10/nasa-announcement/</nowiki>
+
*<nowiki>https://www.openscapes.org/blog/2021/03/10/nasa-announcement/</nowiki>
* Pangeo
+
*Pangeo
  
== '''Prior/Existing ESIP and Cloud Computing Cluster Artifacts''' ==
+
=='''Prior/Existing ESIP and Cloud Computing Cluster Artifacts'''==
  
* [[2015-2020 Strategic Plan|https://wiki.esipfed.org/2015-2020_Strategic_Plan]]
+
*[[2015-2020 Strategic Plan|https://wiki.esipfed.org/2015-2020_Strategic_Plan]]
* [[Cluster tools|https://wiki.esipfed.org/Cluster_tools]]
+
*[[Cluster tools|https://wiki.esipfed.org/Cluster_tools]]
* <nowiki>https://esip.figshare.com/articles/online_resource/How_to_Cluster_in_ESIP/12827963/1</nowiki>
+
*<nowiki>https://esip.figshare.com/articles/online_resource/How_to_Cluster_in_ESIP/12827963/1</nowiki>
* [https://docs.google.com/document/d/1ko1uu11ydYMPYQQVDnQ2cK4WwAh4LfDFaE_GYdwFSMk/edit Cloud Computing Cluster Reboot]
+
*[https://docs.google.com/document/d/1ko1uu11ydYMPYQQVDnQ2cK4WwAh4LfDFaE_GYdwFSMk/edit Cloud Computing Cluster Reboot]
* [[Cloud Computing|https://wiki.esipfed.org/Cloud_Computing]]
+
*[[Cloud Computing|https://wiki.esipfed.org/Cloud_Computing]]
* [https://docs.google.com/spreadsheets/d/1aMs0wog-uvbRWmrDtHzc1Z5mz6r70iEOgxSwEsblPQs/edit#gid=0 Cloud Computing Invited Presentations]
+
*[https://docs.google.com/spreadsheets/d/1aMs0wog-uvbRWmrDtHzc1Z5mz6r70iEOgxSwEsblPQs/edit#gid=0 Cloud Computing Invited Presentations]
* [https://docs.google.com/document/d/1vuzQVqIcsKpbOiWEoxhrmcQGYg73uSQtL8M5l7YS65Y/edit# Cloud Computing Cluster Minutes]
+
*[https://docs.google.com/document/d/1vuzQVqIcsKpbOiWEoxhrmcQGYg73uSQtL8M5l7YS65Y/edit# Cloud Computing Cluster Minutes]

Revision as of 19:38, April 25, 2021

ESIP Cloud Computing Cluster 2021 Plan

The Cloud Computing Cluster 2021 Objective is to create more cloud experts.

Create more cloud experts.

The ESIP Cloud Computing Cluster aims to create more “cloud experts”: Earth data science users who use cloud resources to get science done either faster or at greater scale. The Cloud Computing Cluster will match cloud technologies with Earth data users and applications, specifically in the Earth sciences but also with decision makers and the general public in mind.

Things we will do to fulfill our objectives:

One of the challenges for this cluster is its broad, cross-cutting mandate to create more cloud experts. Topics range from science on the cloud to infrastructure as code. Stakeholders range from scientists to decision makers to software developers.

We can address these challenges through our general and targeted approaches:

  • This cluster is a home to ambiguous cross-over and cross-cutting topics.
  • This cluster acknowledges that there is not one set of best practices or tools that will address all applications and experience levels.
  • This cluster will meet 1x / month to participate in knowledge sharing and conversation.
  • This cluster will (optionally) meet 1x / month to have a working session where we work on disseminating and formalizing the outputs of the cluster.
  • This cluster will have quarterly themes where we will lead and engage in discussions about active topics of interest.
    • The first quarter will be May 1, 2021 - July 31, 2021 and will focus on cloud-friendly data formats and tools.
    • This cluster will lead 1x / quarter #cloudclinics. These “live” events will focus on the theme of the quarter and include volunteer experts to field questions.
      • When will the first cloud clinic be? Perhaps during or right after the ESIP summer session.
      • Ongoing #cloudclinics Twitter and Slack channels for posting and answering questions
    • Themes for later quarters may be:
      • Infrastructure as code and DevOps best practices
      • Metadata standards, e.g. guidance on when to use STAC and when to use CMR
      • Open science platforms, such as Pangeo
  • This cluster will create documentation on common and current cloud science topics such as:
    • Which metadata standard is right for my use case? When to use STAC, CMR, OGC
    • What is the right cloud-optimized data format for my data and my science use case?
    • How do I find the right publicly available data on the cloud for my use case?
  • ESIP Summer session “The saga continues: new advances in cloud-friendly data formats”
    • MRF+CRF, Zarr, EPT, COG, NetCDF4/HDF5, GRIB2, TileDB, Ohmy
    • Zarr metadata for NetCDF4/HDF5
    • Pangeo-forge - build your own cloud-friendly dataset
    • What is the metadata required?
    • What should be the durability/ephemeral nature of cloud-native formats?
    • How can data formats help solve the multi-cloud multi-region challenge?
    • The format suited to one need (visualization) may differ from the format suited to another (analysis)
      • Xarray can be used for multi-dimensional analysis but is challenging to visualize
      • What are some cool solutions to this challenge?

Things our collaboration area needs to deliver our objectives?

(e.g. Partnerships, in-kind support, staff support)

How will we know we are on the right track?

  • Engineers and scientists attend and are interested in our webinar series
  • Participation in live #cloudclinics
  • Activity on social channels (Twitter and Slack) for #cloudclinics

How will others know what we are doing in & out of ESIP?

  • Post webinars on the ESIP YouTube channel
  • Attend other sessions and advertise webinars and #cloudclinics

Existing or Desired Cross-collaboration area connections

  • https://www.openscapes.org/blog/2021/03/10/nasa-announcement/
  • Pangeo

Prior/Existing ESIP and Cloud Computing Cluster Artifacts