Difference between revisions of "Cloud Telecons 09/23/2016"

From Earth Science Information Partners (ESIP)

Latest revision as of 14:12, October 24, 2016

Telecon Info


To start the online portion of the Personal Conference meeting


  1. Go to https://global.gotomeeting.com/join/445841573
  2. You can also dial in using your phone: United States +1 (571) 317-3112
  3. Access Code: 445-841-573

September 26, 2016 ESIP cloud computing cluster telecon recap

Participants: Bruce Caron, Fei Hu, Caller o2, Chaowei Yang, Frank Greguska, Gill, Kevin M, Jeffrey R Hall, Joseph C Jacob, Li Angela W., Nga T Quach, Stephana Klene, Vakhnin Andrey A, Walter Jeff (This list may be incomplete. If anyone is missing, please add them here. Thanks!)

Presentation: NEXUS Framework for Data Ingestion and Processing (Frank Greguska)

Two scalable databases:

  • Cassandra: data is broken into tiles -> tiles are stored in the database -> tiles are read back on request
  • Solr DB cluster: indexes the chunks spatially -> identifies the relevant data quickly -> enables real-time data analytics
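
The split between the two stores can be illustrated with a small sketch. It is not NEXUS code: a plain dictionary stands in for Cassandra, a plain list stands in for the Solr index, and the names `split_into_tiles`, `ingest`, and `TileRecord`, as well as the bounding-box layout, are hypothetical choices made for illustration only.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

# Assumed layout: (min_lat, min_lon, max_lat, max_lon)
BBox = Tuple[float, float, float, float]


@dataclass
class TileRecord:
    """Index entry: where a tile is, not what it contains."""
    tile_id: str
    bbox: BBox


def split_into_tiles(grid: List[List[float]], lat0: float, lon0: float,
                     step: float, tile_size: int) -> Iterator[Tuple[BBox, List[List[float]]]]:
    """Yield (bbox, values) for square tiles cut from a 2-D grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(0, n_rows, tile_size):
        for c in range(0, n_cols, tile_size):
            values = [row[c:c + tile_size] for row in grid[r:r + tile_size]]
            bbox = (lat0 + r * step,
                    lon0 + c * step,
                    lat0 + (r + len(values)) * step,
                    lon0 + (c + len(values[0])) * step)
            yield bbox, values


def ingest(grid, lat0, lon0, step, tile_size,
           tile_store: Dict[str, List[List[float]]],
           spatial_index: List[TileRecord]) -> None:
    """Write tile data to one store and its spatial extent to another."""
    for bbox, values in split_into_tiles(grid, lat0, lon0, step, tile_size):
        tile_id = str(uuid.uuid4())
        tile_store[tile_id] = values                      # stand-in for Cassandra
        spatial_index.append(TileRecord(tile_id, bbox))   # stand-in for Solr


# A 4 x 4 grid with 1-degree cells, cut into 2 x 2 tiles.
grid = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tile_store: Dict[str, List[List[float]]] = {}
spatial_index: List[TileRecord] = []
ingest(grid, lat0=0.0, lon0=0.0, step=1.0, tile_size=2,
       tile_store=tile_store, spatial_index=spatial_index)
```

The point of the split is that a spatial search only touches the small index records; the bulkier tile values are fetched by ID afterwards.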

NEXUS Interface: draw bounding box
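
A bounding box drawn in the interface presumably turns into a spatial lookup against the index. The sketch below shows one way such a lookup could work, using a simple rectangle-overlap test. The function names and the `(min_lat, min_lon, max_lat, max_lon)` layout are assumptions, and the sketch ignores boxes that cross the antimeridian.

```python
from typing import List, Tuple

# Assumed layout: (min_lat, min_lon, max_lat, max_lon)
BBox = Tuple[float, float, float, float]


def intersects(a: BBox, b: BBox) -> bool:
    """True if two boxes overlap or touch on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_tiles(index: List[Tuple[str, BBox]], query: BBox) -> List[str]:
    """Return the IDs of all indexed tiles that overlap the query box."""
    return [tile_id for tile_id, bbox in index if intersects(bbox, query)]


# Hypothetical index entries: (tile ID, tile extent).
index = [
    ("tile-a", (0.0, 0.0, 10.0, 10.0)),
    ("tile-b", (10.0, 0.0, 20.0, 10.0)),
    ("tile-c", (40.0, 40.0, 50.0, 50.0)),
]
selected = find_tiles(index, (5.0, 5.0, 12.0, 8.0))
```

Only the tiles in `selected` would then need to be read from the tile store for analysis.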

Big data problems (the 4 V’s): Volume 14.6 PB; Velocity 16.0 TB/day; Variety 9,462 datasets; Veracity: products at different levels have different quality


Upcoming big data mission: Surface Water and Ocean Topography (SWOT)
Volume:

  • Tiling algorithms: Solr for spatial extents and metadata; Cassandra for tile data; horizontally scalable

Variety:

  • Configurable streams: multiple tiling algorithms, L2 swath data, L3/L4 gridded data
  • Customized tiling algorithms
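
One way to picture a configurable stream with interchangeable tiling algorithms is a small registry keyed by name. This is a minimal sketch under stated assumptions, not the Spring XD configuration used by NEXUS: `TILERS`, `run_stream`, the config keys, and both tiling functions are hypothetical.

```python
from typing import Callable, Dict, List

Rows = List[List[int]]
Tiler = Callable[[Rows, int], List[Rows]]


def tile_gridded(rows: Rows, size: int) -> List[Rows]:
    """Cut a regular 2-D grid (e.g. L3/L4 data) into size x size blocks."""
    tiles = []
    for r in range(0, len(rows), size):
        for c in range(0, len(rows[0]), size):
            tiles.append([row[c:c + size] for row in rows[r:r + size]])
    return tiles


def tile_swath(rows: Rows, size: int) -> List[Rows]:
    """Group consecutive scan lines (e.g. L2 swath data) into chunks."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# Registry of available algorithms; a custom tiler is added by registering it.
TILERS: Dict[str, Tiler] = {"gridded": tile_gridded, "swath": tile_swath}


def run_stream(config: Dict[str, object], rows: Rows) -> List[Rows]:
    """Pick the tiling algorithm named in the stream configuration."""
    tiler = TILERS[str(config["tiler"])]
    return tiler(rows, int(config["tile_size"]))


data = [[r * 4 + c for c in range(4)] for r in range(4)]
gridded_tiles = run_stream({"tiler": "gridded", "tile_size": 2}, data)
swath_tiles = run_stream({"tiler": "swath", "tile_size": 2}, data)
```

With this shape, supporting a new dataset type means adding one function to the registry rather than changing the stream itself.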

Velocity:

  • Horizontally scalable streams
  • Cloud: OpenStack, Amazon => NEXUS deployed on the cloud
Questions raised: how is NEXUS deployed on the cloud, and how does it scale up and down?

Veracity:

  • Pluggable validation checks, e.g. checking that there is actual data in the tile
  • Data transformation
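
Pluggable validation checks can be sketched as a list of small functions applied to every tile. The example is illustrative only: it assumes missing values are encoded as NaN, and `has_data`, `within_range`, and `validate` are hypothetical names, not part of NEXUS.

```python
import math
from typing import Callable, List, Optional

Tile = List[List[float]]
# A check returns None when the tile passes, or an error message otherwise.
Check = Callable[[Tile], Optional[str]]


def has_data(tile: Tile) -> Optional[str]:
    """Fail if every value in the tile is missing (NaN) or the tile is empty."""
    for row in tile:
        for value in row:
            if not math.isnan(value):
                return None
    return "tile contains no valid data"


def within_range(lo: float, hi: float) -> Check:
    """Build a check that rejects non-missing values outside [lo, hi]."""
    def check(tile: Tile) -> Optional[str]:
        for row in tile:
            for value in row:
                if not math.isnan(value) and not (lo <= value <= hi):
                    return f"value {value} outside [{lo}, {hi}]"
        return None
    return check


def validate(tile: Tile, checks: List[Check]) -> List[str]:
    """Run every configured check and collect the failure messages."""
    results = (check(tile) for check in checks)
    return [message for message in results if message is not None]


nan = float("nan")
checks = [has_data, within_range(-5.0, 45.0)]
good = validate([[12.5, nan], [13.0, 14.1]], checks)   # passes both checks
empty = validate([[nan, nan]], checks)                 # fails has_data only
```

Because each check is just a callable, the set of checks can differ per dataset without changing the ingestion code.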

How to implement?

  • Spring XD: ZooKeeper, Kafka for node communication, Redis
  • Spring Cloud Data Flow

If you have any more questions, please send them to greguska@jpl.nasa.gov