Cloud Telecons 09/26/2016

From Earth Science Information Partners (ESIP)

Telecon Info


To start the online portion of the Personal Conference meeting


  1. Go to https://global.gotomeeting.com/join/445841573
  2. You can also dial in using your phone: United States +1 (571) 317-3112
  3. Access Code: 445-841-573

September 26, 2016 ESIP cloud computing cluster telecon recap

Participants: Bruce Caron, Fei Hu, Caller o2, Chaowei Yang, Frank Greguska, Gill, Kevin M, Jeffrey R Hall, Joseph C Jacob, Li Angela W., Nga T Quach, Stephana Klene, Vakhnin Andrey A, Walter Jeff (this list may be incomplete; if anyone is missing, please add them here. Thanks!)

Presentation: NEXUS Framework for Data Ingestion and Processing (Frank Greguska)

Two scalable databases:

  • Cassandra: data are broken into tiles, stored in the database, and read back as tiles
  • Solr DB cluster: indexes the chunks spatially so data can be identified quickly, enabling real-time data analytics

NEXUS Interface: draw bounding box
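
A minimal sketch of how the two stores and the bounding-box interface could fit together, assuming in-memory dictionaries in place of Cassandra and Solr. The names `Tile`, `TileStore`, `SpatialIndex`, and `mean_in_bbox` are hypothetical and are not taken from NEXUS; the sketch only illustrates the idea of asking the index which tiles intersect a box and then fetching just those tiles.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat); a hypothetical convention.
BBox = Tuple[float, float, float, float]


@dataclass
class Tile:
    tile_id: str
    bbox: BBox
    values: List[float]


class TileStore:
    """In-memory stand-in for the tile database (Cassandra in the talk)."""

    def __init__(self) -> None:
        self._tiles: Dict[str, Tile] = {}

    def put(self, tile: Tile) -> None:
        self._tiles[tile.tile_id] = tile

    def get(self, tile_id: str) -> Tile:
        return self._tiles[tile_id]


class SpatialIndex:
    """In-memory stand-in for the spatial index (Solr in the talk)."""

    def __init__(self) -> None:
        self._extents: Dict[str, BBox] = {}

    def add(self, tile_id: str, bbox: BBox) -> None:
        self._extents[tile_id] = bbox

    def search(self, query: BBox) -> List[str]:
        # Two boxes intersect unless one lies entirely to one side of the other.
        q_min_lon, q_min_lat, q_max_lon, q_max_lat = query
        hits = []
        for tile_id, (min_lon, min_lat, max_lon, max_lat) in self._extents.items():
            overlaps_lon = min_lon <= q_max_lon and max_lon >= q_min_lon
            overlaps_lat = min_lat <= q_max_lat and max_lat >= q_min_lat
            if overlaps_lon and overlaps_lat:
                hits.append(tile_id)
        return sorted(hits)


def mean_in_bbox(index: SpatialIndex, store: TileStore, query: BBox) -> Optional[float]:
    """Average the values of every tile whose extent intersects the query box.

    Simplification: whole tiles are averaged; values are not clipped to the box.
    """
    values = [v for tile_id in index.search(query) for v in store.get(tile_id).values]
    return sum(values) / len(values) if values else None
```

Only the small index is scanned per query; tile data is fetched by key afterwards, which is one plausible reason for splitting the work between an index and a tile store.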

Big data problems, the 4 V's: volume (14.6 PB), velocity (16.0 TB/day), variety (9,462 datasets), veracity (products at different levels have different quality)


Upcoming big data mission: Surface Water and Ocean Topography (SWOT)

Volume:

  • Tiling algorithms: Solr for spatial extents and metadata; Cassandra for tile data; horizontally scalable

Variety:

  • configurable streams: multiple tiling algorithms, L2 Swath data, L3/L4 Gridded data
  • customized tiling algorithms
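
As a rough illustration of configurable tiling, the sketch below splits a small gridded array into tiles using an algorithm selected by name. It is an assumption-laden toy, not the NEXUS implementation: the function names, the `TILERS` registry, and the use of row/column offsets in place of real spatial extents are all hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Grid = List[List[float]]
# A tile is ((row_start, col_start), sub-grid); offsets stand in for spatial extents.
TileOut = Tuple[Tuple[int, int], Grid]
Tiler = Callable[[Grid], Iterator[TileOut]]


def fixed_size_tiler(tile_rows: int, tile_cols: int) -> Tiler:
    """Build a tiler that cuts a grid into blocks of at most tile_rows x tile_cols."""
    def tiler(grid: Grid) -> Iterator[TileOut]:
        n_rows = len(grid)
        n_cols = len(grid[0]) if grid else 0
        for r in range(0, n_rows, tile_rows):
            for c in range(0, n_cols, tile_cols):
                # Tiles on the right and bottom edges may be smaller than requested.
                yield (r, c), [row[c:c + tile_cols] for row in grid[r:r + tile_rows]]
    return tiler


def row_strip_tiler(strip_rows: int) -> Tiler:
    """Build a second, hypothetical tiler that cuts a grid into full-width strips."""
    def tiler(grid: Grid) -> Iterator[TileOut]:
        for r in range(0, len(grid), strip_rows):
            yield (r, 0), [list(row) for row in grid[r:r + strip_rows]]
    return tiler


# Hypothetical registry: a stream configuration names the tiling algorithm to use.
TILERS: Dict[str, Tiler] = {
    "blocks_2x2": fixed_size_tiler(2, 2),
    "strips_1": row_strip_tiler(1),
}


def tile_dataset(grid: Grid, algorithm: str) -> List[TileOut]:
    """Tile a grid with the algorithm selected by name."""
    return list(TILERS[algorithm](grid))
```

Adding a customized tiling algorithm then amounts to registering another function under a new name, without touching the code that consumes the tiles.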

Velocity:

  • Horizontally scalable streams
  • Cloud: OpenStack, Amazon => NEXUS deployed on cloud
Question raised: how do you deploy NEXUS on the cloud, and how do you scale it up and down?

Veracity:

  • pluggable validation checks, e.g. confirming that there is actual data in the tile
  • data transformation
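
One way pluggable validation checks and a transformation step might look is sketched below. This is only an illustration under assumptions: tiles are plain lists of floats with NaN marking missing values, and the range check and the Kelvin-to-Celsius transformation are hypothetical examples, not checks described in the talk. Only the "tile contains actual data" check comes from the notes above.

```python
import math
from typing import Callable, List, Optional

# A check returns an error message, or None when the tile passes.
Check = Callable[[List[float]], Optional[str]]


def has_actual_data(values: List[float]) -> Optional[str]:
    """Fail when the tile is empty or every value is missing (NaN)."""
    if not values or all(math.isnan(v) for v in values):
        return "tile contains no actual data"
    return None


def within_range(lo: float, hi: float) -> Check:
    """Build a hypothetical check that non-missing values fall inside [lo, hi]."""
    def check(values: List[float]) -> Optional[str]:
        bad = [v for v in values if not math.isnan(v) and not (lo <= v <= hi)]
        return f"{len(bad)} value(s) outside [{lo}, {hi}]" if bad else None
    return check


def validate(values: List[float], checks: List[Check]) -> List[str]:
    """Run every configured check and collect the failure messages."""
    errors = []
    for check in checks:
        message = check(values)
        if message is not None:
            errors.append(message)
    return errors


def kelvin_to_celsius(values: List[float]) -> List[float]:
    """Hypothetical transformation step; NaN values stay NaN."""
    return [v - 273.15 for v in values]
```

Because each check is just a function, the list passed to `validate` can differ per dataset, which is the sense in which the checks are pluggable here.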

How to implement?

  • Spring XD: Zookeeper, Kafka for node communication, Redis
  • Spring Cloud Data Flow

If you have any more questions, please send them to greguska@jpl.nasa.gov