Cloud Telecons 09/23/2016
Latest revision as of 14:12, October 24, 2016
Telecon Info
To start the online portion of the Personal Conference meeting
- Go to https://global.gotomeeting.com/join/445841573
- You can also dial in using your phone: United States +1 (571) 317-3112
- Access Code: 445-841-573
September 26, 2016 ESIP cloud computing cluster telecon recap
Participants: Bruce Caron, Fei Hu, Caller o2, Chaowei Yang, Frank Greguska, Gill, Kevin M, Jeffrey R Hall, Joseph C Jacob, Li Angela W., Nga T Quach, Stephana Klene, Vakhnin Andrey A, Walter Jeff
(This list may be incomplete. If anyone is missing, please add them here. Thanks!)
Presentation: NEXUS Framework for Data Ingestion and Processing (Frank Greguska)
Two scalable databases:
- Cassandra: data are broken into tiles -> stored in the database -> retrieved as tiles
- Solr DB cluster: chunks are indexed spatially -> data can be identified quickly -> real-time data analytics
NEXUS Interface: draw bounding box
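The bounding-box lookup can be pictured with a small Python sketch. This is only an illustration, not NEXUS code: the `BBox` class, the `find_tiles` function, and the tile names are hypothetical, and an in-memory dict with a linear scan stands in for the spatial index that the Solr cluster would provide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BBox:
    """Hypothetical latitude/longitude bounding box (degrees)."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap when their ranges overlap on both axes.
        return (self.min_lat <= other.max_lat and other.min_lat <= self.max_lat
                and self.min_lon <= other.max_lon and other.min_lon <= self.max_lon)


def find_tiles(index: Dict[str, BBox], query: BBox) -> List[str]:
    """Return the ids of all tiles whose extent intersects the query box."""
    return sorted(tile_id for tile_id, extent in index.items()
                  if extent.intersects(query))


# Assumed example index: three tiles, each covering 10 x 10 degrees.
tile_index = {
    "tile_a": BBox(0, 0, 10, 10),
    "tile_b": BBox(0, 10, 10, 20),
    "tile_c": BBox(10, 0, 20, 10),
}

small_hits = find_tiles(tile_index, BBox(2, 2, 5, 5))
large_hits = find_tiles(tile_index, BBox(5, 5, 15, 15))
print(small_hits, large_hits)
```

A real spatial index would avoid scanning every tile; the scan here only keeps the sketch short.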
Big data problems: the 4 V's: volume (14.6 PB), velocity (16.0 TB/day), variety (9,462 datasets), and veracity (products at different levels have different quality)
Upcoming big data mission: Surface Water and Ocean Topography (SWOT)
Volume:
- Tiling algorithms: Solr for spatial extents and metadata; Cassandra for tile data; horizontally scalable
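As a rough illustration of the tiling idea, the following Python sketch splits a small gridded array into fixed-size tiles and keeps each tile's spatial extent separate from its data, mirroring the split between a metadata index and a tile store. It is a sketch under stated assumptions, not the NEXUS tiling algorithm: the function name `tile_grid`, the tile size, the grid origin and step, and the dict layouts are all hypothetical, and plain dicts stand in for Solr and Cassandra.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]


def tile_grid(grid: Grid, lat0: float, lon0: float, step: float,
              tile_size: int) -> Tuple[Dict[str, dict], Dict[str, Grid]]:
    """Split a 2D grid into square tiles of at most tile_size x tile_size cells.

    Returns (index, store):
      index maps tile id -> spatial extent (stand-in for the metadata index),
      store maps tile id -> tile values (stand-in for the tile database).
    """
    index: Dict[str, dict] = {}
    store: Dict[str, Grid] = {}
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(0, n_rows, tile_size):
        for c in range(0, n_cols, tile_size):
            # Slice out one tile; edge tiles may be smaller than tile_size.
            tile = [row[c:c + tile_size] for row in grid[r:r + tile_size]]
            tile_id = f"tile_{r}_{c}"
            index[tile_id] = {
                "min_lat": lat0 + r * step,
                "max_lat": lat0 + (r + len(tile)) * step,
                "min_lon": lon0 + c * step,
                "max_lon": lon0 + (c + len(tile[0])) * step,
            }
            store[tile_id] = tile
    return index, store


# Assumed example: a 4 x 4 grid with 1-degree cells, cut into 2 x 2 tiles.
grid = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
index, store = tile_grid(grid, lat0=0.0, lon0=0.0, step=1.0, tile_size=2)
print(sorted(index))
```

Because each tile is an independent record keyed by its id, tiles can be spread across many nodes, which is what makes this layout horizontally scalable.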
Variety:
- configurable streams: multiple tiling algorithms, L2 Swath data, L3/L4 Gridded data
- customized tiling algorithms
Velocity:
- Horizontally scalable streams
- Cloud: OpenStack, Amazon => NEXUS deployed on the cloud
- Open questions: how do you deploy NEXUS on the cloud, and how do you scale up and down?
Veracity:
- pluggable validation checks, e.g. confirming that a tile contains actual data
- data transformation
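One way to picture pluggable validation checks is as a list of small functions applied to each tile. The Python sketch below is an assumption about how this could look, not the NEXUS implementation: the names `has_actual_data`, `within_range`, and `validate`, the use of NaN as the fill value, and the example range are all hypothetical.

```python
import math
from typing import Callable, List, Optional

Tile = List[List[float]]
# A check returns an error message, or None when the tile passes.
Check = Callable[[Tile], Optional[str]]


def has_actual_data(tile: Tile) -> Optional[str]:
    """Fail when the tile is empty or holds only fill values (NaN assumed)."""
    values = [v for row in tile for v in row]
    if not values or all(math.isnan(v) for v in values):
        return "tile contains no actual data"
    return None


def within_range(low: float, high: float) -> Check:
    """Build a check that rejects non-fill values outside [low, high]."""
    def check(tile: Tile) -> Optional[str]:
        for row in tile:
            for v in row:
                if not math.isnan(v) and not (low <= v <= high):
                    return f"value {v} outside [{low}, {high}]"
        return None
    return check


def validate(tile: Tile, checks: List[Check]) -> List[str]:
    """Run every configured check and collect the failure messages."""
    errors = []
    for check in checks:
        message = check(tile)
        if message is not None:
            errors.append(message)
    return errors


# Assumed configuration: checks are plugged in as a plain list.
checks = [has_actual_data, within_range(-5.0, 40.0)]
empty_tile = [[float("nan"), float("nan")]]
good_tile = [[12.5, float("nan")], [13.0, 14.2]]
print(validate(empty_tile, checks), validate(good_tile, checks))
```

Adding a new check then only requires appending another function to the list, without changing the ingestion code that calls `validate`.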
How to implement?
- Spring XD: ZooKeeper, Kafka for node communication, Redis
- Spring Cloud Data Flow
If you have any more questions, please send them to greguska@jpl.nasa.gov