AQ Community server software

From Earth Science Information Partners (ESIP)

This session covers standards and conventions, implementation for gridded and station data, development tools, and server performance.

Issues re. the use of netCDF and other data formats

netCDF is a standard format for multi-dimensional data. CF-netCDF is used both as an archival format for gridded data and as a payload format for WCS queries.

  • Issue: ambiguity and completeness of CF
    • Issue: CF (udunits) time format is not the same as the ISO 8601 time format (as used by WCS)
    • Issue: geo-referencing (also see CF-ML discussion "the need to store lat/lon coordinates in a CF-compliant netCDF file")
    • What is missing in CF?
    • server-independent CF-API package
  • Issue: We should define a standard Python interface (PyNIO, python-netcdf4, scipy.io.netcdf?)
  • Issue: other output formats
    • support fused into server or add-on concept (possibly using the public W*S/NetCDF interface)
    • Delivery of (small) data sets in ASCII/csv format?
  • Issue: Reading other gridded input data formats? (e.g. GRIB)
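
To illustrate the CF/ISO time mismatch listed above, here is a minimal sketch that converts a CF-style time value (a number plus a udunits string such as "hours since 2010-01-01 00:00:00") into the ISO 8601 form that WCS expects. The function name `cf_time_to_iso` is hypothetical, and the sketch assumes UTC, the standard Gregorian calendar, and only the four units listed in the table; a real server would more likely rely on a udunits-aware library.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: only these udunits-style units are handled in this sketch.
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}

# Matches e.g. "hours since 2010-01-01 00:00:00" or "days since 2000-01-01".
_UNITS_PATTERN = re.compile(
    r"\s*(\w+)\s+since\s+(\d{4})-(\d{1,2})-(\d{1,2})"
    r"(?:[ T](\d{1,2}):(\d{2})(?::(\d{2}))?)?\s*"
)


def cf_time_to_iso(value, units):
    """Convert a CF time value and its units string to an ISO 8601 UTC string."""
    match = _UNITS_PATTERN.fullmatch(units)
    if match is None:
        raise ValueError("unsupported CF time units: %r" % units)
    unit = match.group(1).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError("unsupported time unit: %r" % unit)
    year, month, day = (int(g) for g in match.group(2, 3, 4))
    # The time-of-day part is optional; missing pieces default to zero.
    hour, minute, second = (int(g) if g else 0 for g in match.group(5, 6, 7))
    epoch = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    moment = epoch + timedelta(seconds=value * _UNIT_SECONDS[unit])
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

For example, `cf_time_to_iso(6, "hours since 2010-01-01 00:00:00")` yields `2010-01-01T06:00:00Z`.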

Data server performance issues/solutions

  • Issue: large datasets in particular take a long time to prepare for delivery (slicing, subsetting, etc.)
    • direct streaming of datasets to the client could be part of the solution
    • generated datasets could be cached for a while, so they could be delivered again when there is a request with compatible parameters
    • problem: both proposals might be mutually exclusive to some degree
  • Issue: XML metadata assembly might take a long time depending on the catalogue content, e.g. with a lot of Identifiers
    • GetCapabilities response Metadata is very static anyway, other responses (DescribeCoverage) could be cached for a while
      • attention: DescribeCoverage response depends on parameters
  • Issue: management overhead when opening NetCDF
    • when opening a NetCDF file, some metadata has to be read and data structures have to be set up
      • input files could be kept open for a while to avoid this overhead
  • Issue: temp file space is limited on WCS server
    • streaming approach for store=false parameter would not require additional local storage
    • temp file approach for store=true parameter could be limited by a maximum dataset size
      • requires a reliable output file size estimator
      • server would return an exception if estimated size is over given threshold
      • would force people to use store=false for large datasets
      • should not violate WCS 1.1 standard (too badly) as only store=false is mandatory
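
The size-limit idea for store=true requests could look roughly like the sketch below. All names (`estimate_output_bytes`, `check_store_request`, `DatasetTooLargeError`) and the default threshold are hypothetical, not part of the server. The estimate assumes an uncompressed output file with a fixed number of bytes per value plus a rough allowance for header metadata, so it is only an approximation.

```python
from math import prod


class DatasetTooLargeError(Exception):
    """Raised when a store=true request would exceed the temp file limit."""


def estimate_output_bytes(shape, bytes_per_value=4, header_bytes=8192):
    """Estimate the output file size for a subset with the given grid shape.

    Assumption: uncompressed data, one variable, and a fixed header allowance.
    """
    return prod(shape) * bytes_per_value + header_bytes


def check_store_request(shape, store, max_stored_bytes=500 * 1024 ** 2):
    """Return the estimated size, or raise if a stored result would be too big.

    Requests with store=False are never rejected here, since streaming
    would not need local temp file space.
    """
    estimated = estimate_output_bytes(shape)
    if store and estimated > max_stored_bytes:
        raise DatasetTooLargeError(
            "estimated %d bytes exceeds limit of %d; use store=false"
            % (estimated, max_stored_bytes)
        )
    return estimated
```

A subset of 24 time steps on a 180 x 360 grid, for instance, is estimated at 24 * 180 * 360 * 4 + 8192 = 6228992 bytes, well under the example threshold. In the server, the raised error would be translated into a WCS exception report.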

Data Servers: Technical Realization (IT) Issues and Solutions

  • which W*S protocol for which purpose, how to combine?
    • WMS for display/preview of spatial data
    • WFS for station description/spatial metadata?
    • WCS for "everything else"? (gridded ("raw") datasets)
  • WCS data structure hierarchy: DataHub; Service; Coverage; Field; Flag
    • WCS 1.1 terminology: Service->Group of similar datasets; Coverage->Dataset; Field->Parameter; Flag->Flag

Gridded data service through WCS

WCS is implemented in multiple versions: 1.0, 1.1.2, 2.0. The AQ Community Server (AComServ) is now implemented using WCS 1.1.2. This generally works well.

  • Issue: Extraction of vertical levels
  • Issue: current state of WCS 2.0?
    • core released, but extensions still in draft (how do we know what is valid?)
  • Issue: serve "virtual" WCS datasets with continuous time line assembled from many source files
    • create a "wrapper" module that can handle such cases?
    • Kari has already done something like this for HTAP datasets; this could be a starting point
  • Issue: desirable time filtering options in WCS: hour of day, day of week, day of month, etc.
    • Kari has already created such filters, but so far they are outside the standard
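
As an illustration of the time filtering options mentioned above, the following sketch selects the indices of a time axis that match an hour-of-day, day-of-week, or day-of-month filter. It is not the existing filter code; the function `filter_times` and its parameter names are hypothetical, and the time axis is assumed to be a list of already decoded `datetime` objects.

```python
from datetime import datetime
from typing import List, Optional, Set


def filter_times(
    times: List[datetime],
    hours_of_day: Optional[Set[int]] = None,   # 0..23
    days_of_week: Optional[Set[int]] = None,   # ISO numbering: 1=Monday .. 7=Sunday
    days_of_month: Optional[Set[int]] = None,  # 1..31
) -> List[int]:
    """Return the indices of all time steps that pass every given filter.

    A filter left as None is not applied.
    """
    selected = []
    for index, moment in enumerate(times):
        if hours_of_day is not None and moment.hour not in hours_of_day:
            continue
        if days_of_week is not None and moment.isoweekday() not in days_of_week:
            continue
        if days_of_month is not None and moment.day not in days_of_month:
            continue
        selected.append(index)
    return selected
```

The returned indices could then be used to slice the time dimension of the gridded data before delivery.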

Delivery of station-point data

  • Issue: use WCS, WFS, or a combination of both (and if so, which combination)?

Access rights

  • Issue: technical options to restrict access to datasets?

Server co-development tools, methods

Server code is maintained through SourceForge; Darcs code repositories are available at WUSTL and in Juelich.

  • Issues: platform independence (netCDF interface), documentation

Relationship to non-AComServ (non-NetCDF) WCS servers

  • data format(s)
    • many WCS clients don't understand NetCDF
  • Issue: protocol compatibility
    • might need to implement more optional features of WCS
  • standard compliance
    • will need a test suite for 1.1.2 (and manage to run it)

Real Data-to-WCS 1.1.2 Mapping

  • Data hub that exposes the data ==> Provider ==> WCS Service
  • Observation platform or network ==> Dataset ==> WCS Coverage
  • Observation parameter/variable ==> Parameter ==> WCS Field
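
The mapping above can be pictured as a small nested data structure. The sketch below is only one possible way to model it; the class names (`WcsService`, `WcsCoverage`, `WcsField`), the helper `find_field`, and all identifiers in the usage example are made up for illustration and do not come from the server code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WcsField:
    """Observation parameter/variable, exposed as a WCS Field."""
    name: str
    units: str


@dataclass
class WcsCoverage:
    """Observation platform or network (dataset), exposed as a WCS Coverage."""
    identifier: str
    fields: Dict[str, WcsField] = field(default_factory=dict)


@dataclass
class WcsService:
    """Data hub/provider, exposed as a WCS Service."""
    provider: str
    coverages: Dict[str, WcsCoverage] = field(default_factory=dict)


def find_field(service: WcsService, coverage_id: str, field_name: str) -> WcsField:
    """Look up a parameter by walking Service -> Coverage -> Field."""
    return service.coverages[coverage_id].fields[field_name]
```

A usage example with invented names:

```python
service = WcsService(provider="ExampleHub")
coverage = WcsCoverage(identifier="example_network")
coverage.fields["pm25"] = WcsField(name="pm25", units="ug/m3")
service.coverages[coverage.identifier] = coverage
print(find_field(service, "example_network", "pm25").units)
```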