AQ Community server software
This session covers: Standards and conventions | Implementation for gridded and station data | Development tools | Server performance
Issues re. the use of netCDF and other data formats
netCDF is a standard format for multi-dimensional data. CF-netCDF is used both as an archival format for gridded data and as a payload format for WCS queries.
- Issue: ambiguity and completeness of CF
- Issue: the CF (udunits) time format is not the same as the ISO time format used by WCS (a conversion sketch follows this list)
- Issue: geo-referencing (also see CF-ML discussion "the need to store lat/lon coordinates in a CF-compliant netCDF file")
- What is missing in CF?
- server-independent CF-API package
- Issue: we should define a standard Python interface (PyNIO, python-netcdf4, scipy.io.netcdf?); the sketch after this list assumes python-netcdf4
- Issue: other output formats
- support either built into the server or as an add-on (possibly using the public W*S/NetCDF interface)
- Delivery of (small) data sets in ASCII/csv format?
- Issue: reading other gridded input data formats? (e.g. GRIB)
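A minimal sketch of what reading a CF-netCDF file and converting its udunits time axis to ISO 8601 strings might look like with python-netcdf4. The file name ('example.nc') and the variable names ('time', 'ozone') are placeholders, not part of AComServ.

 import netCDF4
 
 # Open a CF-netCDF file and read its CF time coordinate (numbers plus a udunits string).
 ds = netCDF4.Dataset('example.nc')
 time = ds.variables['time']
 
 # Convert CF/udunits time values to datetime objects, then to the ISO 8601
 # strings that WCS uses (e.g. in TimeSequence parameters).
 dates = netCDF4.num2date(time[:], units=time.units,
                          calendar=getattr(time, 'calendar', 'standard'))
 iso_times = [d.isoformat() for d in dates]
 
 # Example data access: the first time step of a hypothetical 'ozone' variable.
 ozone_slice = ds.variables['ozone'][0, :, :]
 ds.close()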
Data server performance issues/solutions
- Issue: especially big datasets take a long time to prepare for delivery (slicing/subsetting, etc.)
- direct streaming of datasets to the client could be part of the solution
- generated datasets could be cached for a while, so they could be delivered again when there is a request with compatible parameters
- problem: the two approaches may partly conflict with each other
- Issue: XML metadata assembly might take a long time depending on the catalogue content, e.g. with a lot of identifiers
- GetCapabilities response metadata is fairly static anyway; other responses (DescribeCoverage) could be cached for a while
- attention: the DescribeCoverage response depends on the request parameters
- Issue: management overhead when opening NetCDF files
- when opening a NetCDF file, some metadata has to be read and data structures have to be set up
- input files could be kept open for a while to avoid this overhead (a sketch of this, together with an output-size check, follows this list)
- Issue: temp file space is limited on WCS server
- the streaming approach for store=false would not require additional local storage
- temp file approach for store=true parameter could be limited by a maximum dataset size
- requires a reliable output file size estimator
- server would return an exception if estimated size is over given threshold
- would force people to use store=false for large datasets
- should not violate WCS 1.1 standard (too badly) as only store=false is mandatory
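A minimal sketch of two of the ideas above: keeping NetCDF input files open between requests, and rejecting store=true requests whose estimated output size exceeds a threshold. The cache structure, the threshold value, and the function names are illustrative assumptions, not AComServ code.

 import netCDF4
 
 _open_files = {}                        # path -> netCDF4.Dataset kept open between requests
 MAX_RESULT_BYTES = 500 * 1024 * 1024    # example temp-space limit for store=true
 
 def get_dataset(path):
     # Reuse an already-open Dataset to avoid re-reading metadata on every request.
     ds = _open_files.get(path)
     if ds is None:
         ds = netCDF4.Dataset(path)
         _open_files[path] = ds
     return ds
 
 def estimate_result_size(var, slices):
     # Rough estimate: number of selected values times bytes per value.
     n = 1
     for dim_len, s in zip(var.shape, slices):
         start, stop, step = s.indices(dim_len)
         n *= max(0, (stop - start + step - 1) // step)
     return n * var.dtype.itemsize
 
 def check_store_request(var, slices):
     # For store=true, refuse requests whose estimated result exceeds the limit,
     # which forces very large requests to use store=false (streaming) instead.
     if estimate_result_size(var, slices) > MAX_RESULT_BYTES:
         raise RuntimeError('estimated result size over limit; please use store=false')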
Data Servers: Technical Realization (IT) Issues and Solutions
- which W*S protocol for which purpose, how to combine?
- WMS for display/preview of spatial data
- WFS for station descriptions / spatial metadata?
- WCS for "everything else"? (gridded ("raw") datasets)
- WCS data structure hierarchy: DataHub; Service; Coverage; Field; Flag
- WCS 1.1 terminology: Service -> Group of similar datasets; Coverage -> Dataset; Field -> Parameter; Flag -> Flag
Gridded data service through WCS
WCS exists in multiple versions: 1.0, 1.1.2, and 2.0. The AQ Community Server (AComServ) currently implements WCS 1.1.2. This generally works well.
- Issue: Extraction of vertical levels
- Issue: current state of WCS 2.0?
- core released, but extensions still in draft (how do we know what is valid?)
- Issue: serve "virtual" WCS datasets with a continuous timeline assembled from many source files
- create a "wrapper" module that can handle such cases? (a sketch follows this list)
- Kari has already done something like this for HTAP datasets; this could be a starting point
- Issue: desirable time filtering options in WCS: hour of day, day of week, day of month, etc.
- Kari has already created such filters, but so far they are outside the standard
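A minimal sketch of presenting many source files as one virtual dataset with a continuous timeline, combined with a non-standard hour-of-day filter, using python-netcdf4's MFDataset (which aggregates classic-format files along the unlimited dimension). The file pattern and variable names are hypothetical and not taken from Kari's HTAP code.

 import glob
 import netCDF4
 
 # Aggregate monthly files into one virtual dataset along the unlimited time dimension.
 files = sorted(glob.glob('htap_ozone_2001??.nc'))   # hypothetical file pattern
 ds = netCDF4.MFDataset(files)
 time = ds.variables['time']
 dates = netCDF4.num2date(time[:], units=time.units)
 
 # Example of a time filter outside the WCS standard: keep only the 12:00 UTC steps.
 noon_idx = [i for i, d in enumerate(dates) if d.hour == 12]
 ozone_noon = ds.variables['ozone'][noon_idx, ...]
 ds.close()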
Delivery of station-point data
- Issue: use WCS or WFS, or a combination of both? If a combination, which one?
Access rights
- Issue: technical options to restrict access to datasets?
Server co-development tools, methods
Server code is maintained through SourceForge; Darcs code repositories are available at WUSTL and in Juelich.
- Issues: Platform independence (netcdf interface), Documentation
Relationship to non-AComServ (non-NetCDF) WCS servers
- data format(s)
- many WCS clients don't understand NetCDF
- Issue: protocol compatibility
- might need to implement more optional features of WCS
- standard compliance
- will need a test suite for 1.1.2 (and manage to run it)
Real Data-to-WCS 1.1.2 Mapping
- Data hub that exposes the data ==> Provider ==> WCS Service
- Observation platform or network ==> Dataset ==> WCS Coverage
- Observation parameter/variable ==> Parameter ==> WCS Field
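To make the mapping concrete, here is what a WCS 1.1.2 KVP GetCoverage request against such a server might look like. The host name, coverage identifier, field name, bounding box, format MIME type, and CRS URN are all illustrative assumptions; only the general parameter layout follows WCS 1.1.

 http://acomserv.example.org/wcs?service=WCS&version=1.1.2&request=GetCoverage
     &identifier=SURFACE_OZONE_NETWORK          (the Coverage, i.e. the dataset)
     &RangeSubset=ozone                         (the Field, i.e. the parameter)
     &BoundingBox=-125,25,-65,50,urn:ogc:def:crs:EPSG::4326
     &TimeSequence=2009-07-01T00:00:00Z/2009-07-31T23:00:00Z
     &format=application/x-netcdf
     &store=false                               (stream the result, no temp file)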