January 2011 Winter Meeting Cloud Computing Session

From Earth Science Information Partners (ESIP)

Cloud Computing

Cloud Computing for Earth Science

Moderator: Phil Yang, NASA Goddard/GMU

Tuesday Jan 4

Many Earth science problems cannot be explored on a single computer or solved within a single science community, but distributed computing paradigms and models, combined with interdisciplinary efforts, can tackle such problems effectively. The emergence of cloud computing offers a potential way to address these Earth science problems. This session presents the latest developments on how cloud computing can help the Earth sciences and how the Earth sciences can help shape cloud computing.

This session will include three talks:

1) Doug Nebert from FGDC will introduce GeoCloud, a cross-agency initiative led by FGDC. In early 2010, FGDC convened government agencies such as Census, NOAA, USGS, and USDA to deploy their geospatial products and applications onto a cloud environment. The objectives of this initiative are to define common operating system and software suites for geospatial applications, explore and document deployment and management strategies, monitor usage and costing of cloud services in an operational environment, and pursue shared system security profiles for such solutions. The results of the project will serve as guidance for government agencies considering cloud service adoption for geospatial capabilities in the future.

2) Qunying Huang from the George Mason University Center for Intelligent Spatial Computing will introduce a joint project between GeoCloud and GEO, drawing on their experience deploying the GEOSS Clearinghouse to a cloud platform. The talk will cover the deployment of geospatial applications onto the Amazon EC2 cloud computing platform and report on issues and research related to leveraging cloud computing for the operational system.

Notes from Session

  • The GeoCloud Sandbox initiative was created as an Architecture and Technology Working Group activity in December 2009. It was created to nominate geospatial applications for testing in the cloud environment as a 1-year prototype.
  • Two deployment environments were abstracted from the projects: one is an open source service stack on Linux64, the other is a Windows 2008 stack.
  • Cloud computing could create a ceiling payment for use similar to that of a utility bill.
  • The platform is the main idea behind moving to cloud computing. Platform as a service delivers a solution stack as a service, generally consuming cloud infrastructure and supporting cloud applications. It facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers.
  • GeoCloud is piloting the deployment of infrastructure as a service
  • Hope to save money in hardware operations and scalability, reduce maintenance, and enable cost-effective testing
  • Platform consists of application servers, platform enablers, app frameworks, and runtime systems.
  • Basic image: Windows 2008 or hardened open source Linux (CentOS); then harden and build base platforms with open source additions such as Java, PHP, etc. and an open source core such as Apache, Postgres, Java, Ruby on Rails; then specialize for target apps: ArcGIS Server, Geospatial Platform, semantic Drupal, GlassFish, OpenGeo, GeoServer network, THREDDS, geospatial HHS, and semantic apps
  • A cost evaluation for each of the initial projects was performed based on data transfer and storage
  • Most projects could be hosted in AWS at $350-500 a month
  • Amazon Web Services was selected as the primary public cloud computing environment for various sizes and numbers of virtual machines.
  • A Dell VMware vCloud environment was selected for government-hosted cloud infrastructure.
  • USA ArcGIS GEOSS Linux64 NOAA Linux64
  • Working on this January-March
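
The cost notes above (a monthly hosting figure and a utility-style bill with a ceiling) can be illustrated with a rough sketch. This is only an illustration under stated assumptions: the function name, the split into compute, transfer, and storage components, and every rate below are hypothetical placeholders, not actual AWS prices or the method used in the GeoCloud cost evaluation.

```python
def estimate_monthly_cost(instance_hours, hourly_rate,
                          gb_transferred, transfer_rate,
                          gb_stored, storage_rate,
                          ceiling=None):
    """Return an estimated monthly bill in dollars.

    All rates are supplied by the caller; none are real price-list values.
    If `ceiling` is given, the bill is capped at that amount (one possible
    reading of a "ceiling payment").
    """
    compute = instance_hours * hourly_rate       # pay per VM hour used
    transfer = gb_transferred * transfer_rate    # pay per GB moved
    storage = gb_stored * storage_rate           # pay per GB kept
    total = compute + transfer + storage
    if ceiling is not None:
        total = min(total, ceiling)
    return round(total, 2)


# Placeholder numbers for illustration only.
example = estimate_monthly_cost(
    instance_hours=2 * 720,   # two virtual machines running all month
    hourly_rate=0.25,
    gb_transferred=200,
    transfer_rate=0.10,
    gb_stored=500,
    storage_rate=0.10,
)
print(example)
```

Passing `ceiling=400` to the same call would cap the result at 400, which is how a fixed upper bound on a usage-based bill might be modeled.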

  • GeoCloud deployed ten geospatial application projects in the cloud network
  • Cloud computing: a model that enables on-demand network access to a shared pool of computing resources that can be rapidly provisioned and released
  • On-demand self-service, multitenancy, measured services, device and location independence, rapid elasticity
  • SaaS (software as a service): Gmail, Skype, Facebook
  • PaaS (platform as a service): Windows Azure, Google App Engine
  • IaaS (infrastructure as a service): AWS
  • Elastic Compute Cloud (EC2): IaaS
  • Simple Storage Service (S3): IaaS
  • Elastic Block Store (EBS): IaaS
  • EC2: a web service that provides resizable compute capacity in the cloud
  • AMI (Amazon Machine Image): a bootable VM image that can be launched as an EC2 instance
  • Scalability: load balancer
  • Reliability: network disaster recovery
  • Reducing duplicated efforts: infrastructure and development
  • We are at a prescient time
  • Technologies: cloud architecture, platform-independent languages, open data standards
  • Challenges: network bottlenecks, data transfer, performance unpredictability, data and personal privacy, scalable storage, computing power, Amazon EBS, bugs in large distributed systems
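
The notes on rapid elasticity and load balancing can be made concrete with a small sketch. It is a simplified illustration, not how EC2 or any specific load balancer works: the function names, the per-instance capacity figure, and the instance limits are all hypothetical.

```python
import math


def instances_needed(requests_per_sec, capacity_per_instance,
                     min_instances=1, max_instances=10):
    """Rapid elasticity: pick how many VM instances to run for a given load.

    The result is clamped between min_instances and max_instances
    (both hypothetical limits chosen by the operator).
    """
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(needed, max_instances))


def round_robin(requests, instances):
    """Load balancing: spread requests over instances in turn."""
    assignment = {name: [] for name in instances}
    for i, request in enumerate(requests):
        target = instances[i % len(instances)]
        assignment[target].append(request)
    return assignment


# Provision for the load, then distribute incoming requests.
count = instances_needed(requests_per_sec=250, capacity_per_instance=100)
vms = [f"vm{n}" for n in range(1, count + 1)]
print(count, round_robin(["r1", "r2", "r3", "r4"], vms))
```

When the load drops, calling `instances_needed` again with the lower rate returns a smaller count, which corresponds to releasing resources that are no longer needed.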