Cloud Computing

From Earth Science Information Partners (ESIP)
Revision as of 21:57, January 13, 2011 by Erinmr

Cloud Computing for Earth Science

Tuesday Jan 4

2:00-2:30pm GeoCloud, Doug Nebert, FGDC
2:30-3:00pm Cloud Enabled GEOSS clearinghouse, Qunying Huang, GMU
3:00-3:30pm NASA Cloud Services, Phil Yang, NASA GSFC

Many Earth science problems cannot be explored on a single computer or solved within a single science community, but they can be tackled effectively through distributed computing paradigms and models and through interdisciplinary efforts. The emergence of cloud computing offers a potential way to address these Earth science problems. This session presents the latest developments on how cloud computing can help the Earth sciences and how the Earth sciences can help shape cloud computing.

This session will include three talks:

1) Doug Nebert from FGDC will introduce GeoCloud, a cross-agency initiative led by FGDC. In early 2010, FGDC called on government agencies such as Census, NOAA, USGS, and USDA to deploy their geospatial products and applications onto a cloud environment. The objectives of this initiative are to define common operating system and software suites for geospatial applications, explore and document deployment and management strategies, monitor usage and costing of cloud services in an operational environment, and pursue shared system security profiles for such solutions. The results of the project will serve as guidance for government agencies considering cloud service adoption for geospatial capabilities.

2) Qunying Huang from the George Mason University Center for Intelligent Spatial Computing will introduce a joint project between GeoCloud and GEO, drawing on their experience deploying the GEOSS clearinghouse to a cloud platform. The deployment of geospatial applications onto the Amazon EC2 cloud computing platform will be introduced, and issues and research on leveraging cloud computing for the operational system will be reported.

3) Chaowei Phil Yang, the lead architect of NASA cloud services hosted by Goddard Space Flight Center, will introduce the NASA Cloud Services.

Notes from Session

The GeoCloud Sandbox initiative was created as an Architecture and Technology Working Group activity in December 2009. It was created to nominate geospatial applications for testing in the cloud environment as a one-year prototype.

  • Two deployment environments were abstracted from the projects: one is an open source service stack on 64-bit Linux, the other is a Windows 2008 stack.
  • Cloud computing could allow a ceiling on payment for use, similar to a utility bill.
  • The platform is the main reason for going to cloud computing. Platform as a service delivers a solution stack as a service, generally consuming cloud infrastructure and supporting cloud applications. It facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers.
  • GeoCloud is piloting the deployment of infrastructure as a service.
  • The hope is to save money on hardware operations and scalability, reduce maintenance, and make testing cost effective.
  • The platform consists of application servers, platform enablers, application frameworks, and runtime systems.
  • Basic images: Windows 2008 and open source Linux (CentOS). These are hardened and built into base platforms with open source additions such as Java and PHP and an open source core such as Apache, Postgres, Java, and Ruby on Rails, then specialized for target apps: ArcGIS Server, Geospatial Platform, semantic Drupal, GlassFish, OpenGeo, GeoServer network, THREDDS, geospatial HHS, and semantic apps.
  • A cost evaluation for each of the initial projects was performed based on its data transfer story.
  • Most projects could be hosted in AWS at $350-500 a month.
  • Amazon Web Services was selected as the primary public cloud computing environment for various sizes and numbers of virtual machines.
  • A Dell VMware vCloud environment was selected for government-hosted cloud infrastructure.
  • USA ArcGIS, GEOSS Linux64, NOAA Linux64
  • Working on this January through March.
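The cost notes above (a utility-style bill with a ceiling, and most projects fitting in $350-500 a month) can be illustrated with a minimal sketch. All names and rates below are hypothetical and chosen only for illustration; they are not actual AWS prices or figures from the GeoCloud cost evaluation.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    instance_hours: float    # total virtual machine hours in the month
    storage_gb: float        # average gigabytes kept in storage
    transfer_out_gb: float   # gigabytes transferred out of the cloud


@dataclass
class Rates:
    # Hypothetical unit prices in dollars, not real provider pricing.
    per_instance_hour: float
    per_storage_gb: float
    per_transfer_gb: float


def estimate_monthly_cost(usage: MonthlyUsage, rates: Rates) -> float:
    """Sum the metered charges, the way a utility bill adds up usage."""
    return (usage.instance_hours * rates.per_instance_hour
            + usage.storage_gb * rates.per_storage_gb
            + usage.transfer_out_gb * rates.per_transfer_gb)


def capped_bill(cost: float, ceiling: float) -> float:
    """Apply a payment ceiling: never bill more than the agreed maximum."""
    return min(cost, ceiling)


# Assumed example: one VM running for a 30-day month, with made-up rates.
example_usage = MonthlyUsage(instance_hours=720, storage_gb=200, transfer_out_gb=100)
example_rates = Rates(per_instance_hour=0.50, per_storage_gb=0.10, per_transfer_gb=0.15)

cost = estimate_monthly_cost(example_usage, example_rates)
print(f"Estimated cost: ${cost:.2f}")
print(f"Billed with a $500 ceiling: ${capped_bill(cost, 500.0):.2f}")
```

With these assumed inputs the estimate lands inside the $350-500 range mentioned in the notes, but only because the rates were picked to do so; real estimates would need the provider's published prices and each project's measured usage.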

Doug Nebert


  • GeoCloud deployed ten geospatial application projects in the cloud network
  • Cloud computing: a model that enables shared, on-demand network access to pooled computing resources that can be rapidly provisioned and released
  • On-demand self-service, multitenancy, measured services, device and location independence, rapid elasticity
  • SaaS (software as a service): Gmail, Skype, Facebook
  • PaaS (platform as a service): Windows Azure, Google App Engine
  • IaaS (infrastructure as a service): AWS
  • Elastic Compute Cloud (EC2): IaaS
  • Simple Storage Service (S3): IaaS
  • Elastic Block Store (EBS): IaaS
  • EC2: a web service that provides resizable compute capacity in the cloud
  • AMI (Amazon Machine Image): a bootable VM image which can be launched as an EC2 instance
  • Scalability: load balancer
  • Reliability: network disaster recovery
  • Reducing duplicated efforts: infrastructure and development
  • We are at a prescient time
  • Technologies: cloud architecture, platform-independent languages

Open data standards

  • Challenges: network bottlenecks in data transfer, performance unpredictability, data and personal privacy, scalable storage and computing power (Amazon EBS), bugs in large distributed systems
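The ideas of rapid elasticity and scaling behind a load balancer, noted above, can be sketched in a few lines. This is an illustrative model only: the function names, the per-instance capacity figure, and the instance limits are assumptions, and the code does not call any AWS API.

```python
import itertools
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many instances to run for the current load.

    Resources are provisioned as load rises and released as it falls,
    always staying within the configured minimum and maximum.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def round_robin(instances, requests):
    """Assign each request to the next instance in turn (a simple load balancer)."""
    pool = itertools.cycle(instances)
    return [(request, next(pool)) for request in requests]


# Assumed example: each instance handles about 50 requests per second.
count = desired_instances(requests_per_second=120, capacity_per_instance=50)
instances = [f"instance-{i}" for i in range(count)]
assignments = round_robin(instances, ["r1", "r2", "r3", "r4"])
print(count, assignments)
```

A real deployment would rely on the provider's own load balancing and scaling services and on measured metrics rather than a fixed capacity number; the sketch only shows the sizing rule and the even spreading of requests.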

Earth Science collaboration Discussion

  • 2011 suggestions: continue the webinars
  • Invite external speakers (need to create a wish list of speakers), mix in more hands-on tutorials, capture the presentations as videos
  • ESIP info commons: ESIP uses a Drupal site that is poorly put together and needs to be redone
  • ESIP tried Google Knol, but it was buggy, made it hard to create a uniform look, was not intuitive to create with, and left it unclear how to handle multiple authors. It links only to your Gmail identity, and linking an ESIP collection is not straightforward
  • Decided to go with a Drupal extension and have a prototype ready by summer
  • Summer meeting: blue sky session on the future of science data and information systems

Rahul

Suggestion for construction of an Earth Science Collaboratory

  • Convergent evolution to ESC
  • ESC = a rich data analysis environment that provides access across a wide spectrum of Earth science data, provides a diverse set of science analysis services and tools, supports the application of services and tools, supports collaboration on data analysis, and supports sharing of data, tools, etc.
  • Several graphics were shown showcasing applications of the ESC
  • Mediator: mediates combinations of tool with data, data with data, tool with tool, tool with workflow, and data with workflow
  • Based on data access standards and a common data model
  • Cyberinfrastructure services used by all the other components: security, social, cloud, discovery, information management, semantic web, evaluation
  • Advantages of ESC: tool availability will be a force multiplier; more tools will be usable
  • Knowledge sharing will evolve from text on paper to a mixture of data, tools, workflows, and articles
  • A wikiHow for Earth science data will emerge
  • ESC will maintain a record of the analysis process
  • Why now? Because it is possible and the need is growing; cloud helps with provisioning resources, tools, workflows, multiple data sets, etc.
  • A 7-year plan of ideas for what to do

Chris Lynnes

KEVIN

  • Context: NASA Earth science data systems are a large, continuing investment; websites are the front door to data and services for users but are discordant