AQ Community server software

From Earth Science Information Partners (ESIP)

Revision as of 03:54, August 7, 2011

This session covers: Standards and conventions | Implementation for gridded and station data | Development tools | Server performance

Issues re. the use of netCDF and other data formats

netCDF is a standard format for multi-dimensional data. CF-netCDF is used both as an archival format for gridded data and as a payload format for WCS queries.

  • Issue: ambiguity and completeness of CF
    • Issue: CF (udunits) time format not the same as ISO Time format (as used by WCS)
    • Issue: geo-referencing (also see CF-ML discussion "the need to store lat/lon coordinates in a CF-compliant netCDF file")
    • What is missing in CF?
    • server-independent CF-API package
  • Issue: We should define a standard python interface (PyNIO, python-netcdf4, scipy.io.netcdf?)
  • Issue: other output formats
    • support fused into server or add-on concept (possibly using the public W*S/NetCDF interface)
    • Delivery of (small) data sets in ASCII/csv format?
  • Issue: Reading other gridded input data formats? (e.g. GRIB)
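
The mismatch between CF (udunits) time and ISO time noted above can be illustrated with a minimal sketch. It assumes a standard calendar, a UTC reference time, and fixed-length units only; the function name `cf_time_to_iso` is hypothetical and not part of any CF or netCDF library.

```python
from datetime import datetime, timedelta, timezone

# Seconds per CF/udunits time unit handled by this sketch (fixed-length units only).
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def cf_time_to_iso(value, units):
    """Convert a CF time value with units like 'hours since 2011-08-01 00:00:00'
    into an ISO 8601 UTC string such as WCS requests use."""
    unit, since, reference = units.split(maxsplit=2)
    if since != "since" or unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported CF time units: {units!r}")
    # Assumption: the reference time is UTC, written as 'YYYY-MM-DD HH:MM:SS'
    # or just 'YYYY-MM-DD'.
    fmt = "%Y-%m-%d %H:%M:%S" if " " in reference else "%Y-%m-%d"
    origin = datetime.strptime(reference, fmt).replace(tzinfo=timezone.utc)
    moment = origin + timedelta(seconds=value * _UNIT_SECONDS[unit])
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

A real server would also have to handle non-standard CF calendars and time zone offsets in the reference time, which this sketch deliberately leaves out.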

Data server performance issues/solutions

  • Issue: especially big datasets take a long time to prepare for delivery (slicing/subsetting, etc.)
    • direct streaming of datasets to the client could be part of the solution, click here for details
    • generated datasets could be cached for a while, so they could be delivered again when there is a request with compatible parameters
    • problem: both proposals might be mutually exclusive to some degree
  • Issue: XML Metadata assembly might take a long time depending on the catalogue content, e.g. with a lot of Identifiers
    • GetCapabilities response Metadata is very static anyway, other responses (DescribeCoverage) could be cached for a while
      • attention: DescribeCoverage response depends on parameters
  • Issue: management overhead when opening NetCDF
    • when opening a NetCDF file, some metadata has to be read and data structures have to be set up
      • input files could be kept open for a while to avoid this overhead
  • Issue: temp file space is limited on WCS server
    • streaming approach for store=false parameter would not require additional local storage
    • temp file approach for store=true parameter could be limited by a maximum dataset size
      • requires a reliable output file size estimator
      • server would return an exception if estimated size is over given threshold
      • would force people to use store=false for large datasets
      • should not violate WCS 1.1 standard (too badly) as only store=false is mandatory
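
The size limit for store=true requests could be sketched roughly as follows. This is an illustration under stated assumptions, not the AComServ implementation: the names `estimate_output_bytes`, `check_store_request` and `CoverageTooLarge`, the 4-byte value size, the fixed header allowance, and the default threshold are all hypothetical.

```python
from math import prod


class CoverageTooLarge(Exception):
    """Raised when a store=true request would exceed the temp-file budget."""


def estimate_output_bytes(shape, bytes_per_value=4, header_bytes=8192):
    # Uncompressed estimate: number of grid cells times value size,
    # plus a fixed allowance for the file header (an assumed figure).
    return prod(shape) * bytes_per_value + header_bytes


def check_store_request(shape, store, max_bytes=500 * 1024**2, bytes_per_value=4):
    """Return the estimated size, or raise if a store=true result is too big."""
    estimate = estimate_output_bytes(shape, bytes_per_value)
    if store and estimate > max_bytes:
        raise CoverageTooLarge(
            f"estimated {estimate} bytes exceeds limit of {max_bytes}; "
            "retry with store=false"
        )
    return estimate
```

Requests with store=false are never rejected here, which matches the idea that large datasets should be streamed rather than written to temp space.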

Data Servers: Technical Realization (IT) Issues and Solutions

  • which W*S protocol for which purpose, how to combine?
    • WMS for display/preview of spatial data
    • WFS for station description/spatial metadata?
    • WCS for "everything else"? (gridded ("raw") datasets)
  • WCS Data structure hierarchy: DataHub; Service; Coverage; Field; Flag
    • WCS 1.1 terminology: Service->Group of similar datasets; Coverage->Dataset; Field->Parameter; Flag->Flag

Gridded data service through WCS

WCS is implemented in multiple versions: 1.0, 1.1.2, 2.0. The AQ Community Server (AComServ) is now implemented using WCS 1.1.2. This generally works well.
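
For orientation, a WCS 1.1.2 GetCoverage request in key-value form could be assembled as in the sketch below. The endpoint, coverage identifier, default CRS URN and default output format are assumptions chosen for illustration; a given server may expect different values, and the helper name `build_getcoverage_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_getcoverage_url(endpoint, identifier, bbox, time_sequence,
                          crs="urn:ogc:def:crs:OGC:2:84",
                          fmt="application/x-netcdf", store=False):
    """Build a KVP GetCoverage URL for WCS 1.1.2 (illustrative only)."""
    min_x, min_y, max_x, max_y = bbox
    params = {
        "service": "WCS",
        "version": "1.1.2",
        "request": "GetCoverage",
        "identifier": identifier,
        # Bounding box corners followed by the CRS they are expressed in.
        "BoundingBox": f"{min_x},{min_y},{max_x},{max_y},{crs}",
        "TimeSequence": time_sequence,
        "format": fmt,
        "store": "true" if store else "false",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and coverage name, for illustration only.
url = build_getcoverage_url(
    "http://example.org/wcs", "ozone",
    (-130, 20, -60, 55), "2011-08-01T00:00:00Z/2011-08-02T00:00:00Z",
)
```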

  • Issue: Extraction of vertical levels
  • Issue: current state of WCS 2.0?
    • core released, but extensions still in draft (how do we know what is valid?)
  • Issue: serve "virtual" WCS datasets with continuous time line assembled from many source files
    • create a "wrapper" module that can handle such cases?
    • Kari has already done something like this for HTAP datasets, this could be a starting point
  • Issue: desirable time filtering options in WCS: hour of day, day of week, day of month, etc.
    • Kari has already created such filters, but so far they are outside the standard
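
The kind of time filtering listed above (hour of day, day of week, day of month) can be sketched as a plain function over time steps. This is only an illustration of the idea and does not reproduce the existing filters mentioned in the notes; the function name `filter_times` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def filter_times(times, hours_of_day=None, days_of_week=None, days_of_month=None):
    """Keep only the time steps matching every filter that is given.

    days_of_week uses ISO numbering (Monday=1 ... Sunday=7).
    A filter left as None is not applied.
    """
    def keep(t):
        return ((hours_of_day is None or t.hour in hours_of_day)
                and (days_of_week is None or t.isoweekday() in days_of_week)
                and (days_of_month is None or t.day in days_of_month))
    return [t for t in times if keep(t)]


# 48 hourly steps starting 2011-08-01 00:00 (a Monday).
steps = [datetime(2011, 8, 1) + timedelta(hours=h) for h in range(48)]
noon_only = filter_times(steps, hours_of_day={12})
mondays_only = filter_times(steps, days_of_week={1})
```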

Delivery of station-point data

  • Issue: use WCS or WFS, Combination of both/which combination?

Access rights

  • Issue: technical options to restrict access to datasets?

Server co-development tools, methods

Server code is maintained through SourceForge; Darcs code repositories are available at WUSTL and in Juelich.

  • Issues: Platform independence (netcdf interface), Documentation

Relationship to non-AComServ (non-NetCDF) WCS servers

  • data format(s)
    • many WCS clients don't understand NetCDF
  • Issue: protocol compatibility
    • might need to implement more optional features of WCS
  • standard compliance
    • will need a test suite for 1.1.2 (and manage to run it)

Real Data-to-WCS-Mapping structure

  • Data hub that exposes the data ==> Provider ==> WCS Service
  • Observation platform or network ==> Dataset ==> WCS Coverage
  • Observation parameter/variable ==> Parameter ==> WCS Field
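
The mapping above can be written down as a small data structure. The sketch below is one possible reading of it; the class names (`WcsService`, `WcsCoverage`, `WcsField`), the `register` helper, and the example provider, dataset and parameter names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WcsField:
    """One observation parameter/variable (WCS Field)."""
    name: str
    units: str = ""


@dataclass
class WcsCoverage:
    """One dataset, e.g. an observation platform or network (WCS Coverage)."""
    identifier: str
    fields: dict = field(default_factory=dict)


@dataclass
class WcsService:
    """One provider, i.e. a data hub that exposes the data (WCS Service)."""
    provider: str
    coverages: dict = field(default_factory=dict)


def register(service, dataset, parameter, units=""):
    """Add a parameter to a dataset, creating the coverage if needed."""
    coverage = service.coverages.setdefault(dataset, WcsCoverage(dataset))
    return coverage.fields.setdefault(parameter, WcsField(parameter, units))


# Hypothetical provider, network and parameters, for illustration only.
hub = WcsService("example-hub")
register(hub, "surface-network", "ozone", "ppb")
register(hub, "surface-network", "pm25", "ug/m3")
```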