Talk:Candidate Technical Topics

From Earth Science Information Partners (ESIP)

General

  • to persuade nodes to use the server package, we need reference implementation(s) demonstrating that it works and performs well. Good news: we are pretty far along this road!

CF-API

Performance

  • non-compressed data preferred
  • many files vs. single file for queries
    • mapping: many files -> single identifier
      • some concerns that this might be too slow; we will have to experiment to find a sensible balance
      • queries might get very large (the only natural limit is the dataset/identifier)
      • need to limit query size on the server side (the DataFed browser currently manages this on the client side)
   --- discussion: the purpose is to protect the server from unintelligent clients
   --- suggestion: link query restrictions to the user connection; registered users could get additional benefits such as larger query sizes; email them information about tools (API) that allow them to make more intelligent queries
      • query size estimator on server side?
      • client library to chunk queries
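
A rough sketch of the last two bullets, a size estimator and a client-side chunker, follows. It is only an illustration: the function names `estimate_query_bytes` and `chunk_time_range`, the 4 bytes per value, and the 50 MB limit are assumptions made for this example and are not taken from any existing server or client package.

```python
from datetime import datetime, timedelta


def estimate_query_bytes(n_times, n_lat, n_lon, bytes_per_value=4):
    """Rough size of an uncompressed grid subset for a single variable."""
    return n_times * n_lat * n_lon * bytes_per_value


def chunk_time_range(start, end, max_span):
    """Split [start, end) into consecutive sub-ranges no longer than max_span."""
    if max_span <= timedelta(0):
        raise ValueError("max_span must be positive")
    chunks = []
    cursor = start
    while cursor < end:
        upper = min(cursor + max_span, end)
        chunks.append((cursor, upper))
        cursor = upper
    return chunks


if __name__ == "__main__":
    # Hypothetical per-request limit that a server might enforce.
    limit = 50 * 1024 * 1024
    size = estimate_query_bytes(n_times=365, n_lat=180, n_lon=360)
    print("estimated bytes:", size, "over limit:", size > limit)

    # A client library could then issue one request per sub-range.
    for lower, upper in chunk_time_range(
        datetime(2011, 1, 1), datetime(2011, 3, 1), timedelta(days=30)
    ):
        print(lower.date(), "->", upper.date())
```

The same estimate could be used on both sides: the server to reject oversized requests, the client library to decide how finely to chunk.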

Common NetCDF Python Interface, NetCDF4

  • Kari cloned the PyNIO interface for Windows, so there is no problem right now for cross-platform development
  • solve other problems first, keep an eye open
  • NetCDF4 might make things more complicated if you try to use all features, might not be easily mappable to WCS concept

Delivery of other data formats, other input formats

  • need to map other formats to WCS and/or CF concept
  • differentiate between format (NetCDF) and convention (CF)
  • chain with WMS server for default views/previews

Revision tracking of Datasets

  • always try to get current data when dealing with real-time data; always expect your data to be old
  • would be nice to have WCS field for "last updated" date, same for NetCDF/CF (global attribute?)
    • we can make something up on our own for a start
    • try to propose that for CF (and WCS)

--- related issue: intelligent harvesting of updates for catalogues (GI-cat); add "modification_time" to GetCapabilities metadata and allow for "updated_since <date>" request in DescribeCoverage. Note: this may be beyond current WCS standard specification.
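
A minimal sketch of how a harvester could use such a field, assuming each coverage carries a UTC "modification_time" string. The timestamp format, the dictionary layout, and the function names are assumptions for illustration; nothing here is defined by the WCS standard.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a timestamp such as '2011-06-01T12:00:00Z' as an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def updated_since(coverages, since):
    """Return the identifiers of coverages modified strictly after `since`.

    `coverages` maps a coverage identifier to its modification_time string.
    """
    cutoff = parse_utc(since)
    return sorted(
        name for name, stamp in coverages.items() if parse_utc(stamp) > cutoff
    )


if __name__ == "__main__":
    # Hypothetical catalogue content.
    catalogue = {
        "ozone_hourly": "2011-06-01T12:00:00Z",
        "pm25_daily": "2011-05-20T00:00:00Z",
    }
    print(updated_since(catalogue, "2011-05-25T00:00:00Z"))
```

A catalogue would store the time of its last harvest and pass it as `since`, so only changed coverages need to be described again.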

Delivery of Point Station data

(see also WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type)

  • put logic/processing into the SQL database as much as possible (views, stored procedures, etc.)
    • try to maintain unit tests for this
  • need to discuss in more detail how this can be served using WCS (Paul/NILU)
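
To illustrate the first bullet and its sub-point, here is a small sketch using SQLite from the Python standard library: the aggregation lives in a database view, and a test checks the view rather than application code. The table layout, the column names, and the `daily_mean` view are invented for this example and do not describe any existing station database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE station (id INTEGER PRIMARY KEY, code TEXT, lat REAL, lon REAL);
CREATE TABLE observation (station_id INTEGER, obs_time TEXT, pm25 REAL);
-- The processing logic is a view, so every client sees the same result.
CREATE VIEW daily_mean AS
    SELECT s.code AS code, date(o.obs_time) AS day, AVG(o.pm25) AS pm25_mean
    FROM observation o JOIN station s ON s.id = o.station_id
    GROUP BY s.code, date(o.obs_time);
"""


def make_test_db():
    """Create an in-memory database with the schema and a few known rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO station VALUES (1, 'NO0001', 59.9, 10.7)")
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [
            (1, "2011-06-01 03:00:00", 10.0),
            (1, "2011-06-01 15:00:00", 20.0),
            (1, "2011-06-02 03:00:00", 30.0),
        ],
    )
    return conn


def daily_means(conn, code):
    """Read the view for one station, ordered by day."""
    rows = conn.execute(
        "SELECT day, pm25_mean FROM daily_mean WHERE code = ? ORDER BY day", (code,)
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(daily_means(make_test_db(), "NO0001"))
```

Because the fixture data are fixed, the expected daily means are known in advance, which is what makes the view testable.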

--- discussion: the new OGC standard for netCDF/CF-1.6 allows for representation of point data. Different views on the same data depending on who (which client) wants to access data: geodynamical fluid view = WCS; detailed description of feature = WFS; sensor based view = SOS

--- really important workshop outcome will be to outline the architecture of a future AQ network: what are server capacities? what are the requirements for the clients? Do we need/want brokers that can translate requests (for example an SOS request into a WCS/WFS request) or should this become a part of the server?

--- netCDF output of station data: see CF-netCDF-ExtensionFor-netCDF Data Model

Technical Access restrictions to WCS

  • HTTP Basic authentication
  • API key
  • does not have to be 100% secure, more about connecting with the users, knowing who they are and to establish an accepted way of accessing the data
  • firewall whitelisting might be an option for small user groups
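
As a sketch of what the first two options mean for a client, the following builds the request headers for each. The Basic scheme (base64 of "user:password") is standard HTTP; the "X-API-Key" header name is only an assumption, since a server could just as well expect the key as a URL parameter.

```python
import base64


def basic_auth_header(user, password):
    """Value of the Authorization header for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def auth_headers(user=None, password=None, api_key=None):
    """Collect the headers a client would send with a WCS request."""
    headers = {}
    if user is not None and password is not None:
        headers["Authorization"] = basic_auth_header(user, password)
    if api_key is not None:
        # Hypothetical header name; not prescribed by WCS.
        headers["X-API-Key"] = api_key
    return headers


if __name__ == "__main__":
    print(auth_headers(user="alice", password="secret"))
    print(auth_headers(api_key="0123456789abcdef"))
```

Basic authentication only encodes the credentials, it does not encrypt them, which fits the point above that the goal is knowing the users rather than strict security.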

--- discussion: this is about sharing and exposing data - hence, data that shall not be openly accessible should not be put on the network in the first place. Unfortunately, reality is more complex: some AQ data are restricted and we risk losing a lot of (free) data if we can't also accommodate at least some restricted data.

--- more important in the short-term is user tracking: might allow for more specific services (bandwidth etc.), alerts, guidance, etc. - also important to demonstrate service use and convince funding agencies.

Relationship with other Servers

  • write a wrapper to import data formats when needed

--- need to define which other servers we want to connect to.

WCS 2.0

  • more modular: core and extensions
  • potentially easier to use/implement?
  • proper CF-NetCDF extension coming

Processing Services

Extended (Time) filtering

  • day of week, hour of day, day of month,... (including ranges)
  • describe non-standard features in capabilities document?
  • might be difficult to get into official standard?
  • does not/should not interfere with standard if you don't use it
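
A minimal sketch of such a filter applied to a time axis, assuming the request has already been parsed into sets or ranges of allowed values. The function name and parameters are hypothetical; how the filter would be expressed in a WCS request is left open.

```python
from datetime import datetime, timedelta


def filter_times(times, weekdays=None, hours=None, days_of_month=None):
    """Keep the timestamps that satisfy every criterion that is given.

    weekdays: allowed days of week, Monday=0 ... Sunday=6
    hours: allowed hours of day, 0 ... 23
    days_of_month: allowed days of month, 1 ... 31
    A criterion left as None is not applied.
    """
    def keep(t):
        return (
            (weekdays is None or t.weekday() in weekdays)
            and (hours is None or t.hour in hours)
            and (days_of_month is None or t.day in days_of_month)
        )

    return [t for t in times if keep(t)]


if __name__ == "__main__":
    # One week of hourly time steps.
    axis = [datetime(2011, 6, 6) + timedelta(hours=h) for h in range(7 * 24)]
    # Weekdays only, 08:00 to 17:00 inclusive; ranges express "including ranges".
    working_hours = filter_times(axis, weekdays=range(0, 5), hours=range(8, 18))
    print(len(working_hours))
```

Since every criterion defaults to None, a request that uses none of them returns the full axis, in line with the last bullet.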

Time zone support

  • server should be able to reprocess time axis time zone according to user request
    • doing this on the user side is not an option, as it would lead to too many mistakes
    • should be relatively easy to simply return time axis using the time zone from the request
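
The conversion itself could look like the sketch below, assuming the server stores its time axis in UTC and the request carries a fixed UTC offset in hours. Both are assumptions; a request could instead name a time zone, which would also require handling daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def shift_time_axis(utc_times, offset_hours):
    """Return the time axis expressed in the requested fixed UTC offset.

    Every input must be timezone-aware; naive values are rejected because
    their meaning would be ambiguous.
    """
    target = timezone(timedelta(hours=offset_hours))
    shifted = []
    for t in utc_times:
        if t.tzinfo is None:
            raise ValueError("time axis values must be timezone-aware")
        shifted.append(t.astimezone(target))
    return shifted


if __name__ == "__main__":
    axis = [datetime(2011, 6, 1, h, tzinfo=timezone.utc) for h in (0, 6, 12)]
    for t in shift_time_axis(axis, -5):
        print(t.isoformat())
```

The instants are unchanged; only their representation differs, so the data values attached to each time step stay where they are.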