Difference between revisions of "Talk:Candidate Technical Topics"

From Earth Science Information Partners (ESIP)
 
Latest revision as of 04:02, August 25, 2011

General

  • in order to persuade nodes to use the server package, we need reference implementation(s) demonstrating that it works and performs well. Good news: we are pretty far along this road!

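CF-API

  • need to read and write CF-compliant files easily
    • python interface for unidata libcf? http://www.unidata.ucar.edu/software/libcf/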

Performance

  • non-compressed data preferred
  • many files vs. single file for queries
    • mapping: many files -> single identifier
      • some concerns that this might be too slow, will have to try and find a sensible balance
      • queries might get very large (the only natural limit is the dataset/identifier)
      • need to limit query size on server side (datafed browser: currently client side management)
   --- discussion: purpose: protect server from unintelligent clients
   --- suggestion: link query restrictions to the user connection; registered users could get additional benefits like larger query sizes; email them information about tools (API) that allow them to do more intelligent queries
      • query size estimator on server side?
      • client library to chunk queries
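
   --- sketch: one way a client library could chunk queries, with a rough size estimator next to it. This is only an illustration under assumptions: the function names, the split along the time axis, and the "number of values" size measure are all hypothetical and not part of the server package.

```python
from datetime import datetime, timedelta


def chunk_time_range(start, end, max_span):
    """Yield (chunk_start, chunk_end) pairs covering [start, end).

    Each piece is at most max_span long, so a client can send several
    small requests instead of one very large one.
    """
    if max_span <= timedelta(0):
        raise ValueError("max_span must be positive")
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + max_span, end)
        yield cursor, chunk_end
        cursor = chunk_end


def estimate_values(n_times, n_rows, n_cols):
    """Rough query size: number of grid values the request would return."""
    return n_times * n_rows * n_cols


def within_limit(n_values, max_values):
    """Server-side style check: accept the query only below the limit."""
    return n_values <= max_values
```

   A client would loop over chunk_time_range(...) and issue one request per pair; the same estimator could back a server-side limit that differs per user.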

Common NetCDF Python Interface, NetCDF4

  • Kari cloned PyNIO interface for Windows, so no problem right now for cross platform development
  • solve other problems first, keep an eye open
  • NetCDF4 might make things more complicated if you try to use all features, might not be easily mappable to WCS concept

Delivery of other data formats, other input formats

  • need to map other formats to WCS and/or CF concept
  • differentiate between format (NetCDF) and convention (CF)
  • chain with WMS server for default views/previews

revision tracking of Datasets

  • always try to get current data when dealing with real time data, always expect your data to be old
  • would be nice to have WCS field for "last updated" date, same for NetCDF/CF (global attribute?)
    • we can make something up on our own for a start
    • try to propose that for CF (and WCS)

--- related issue: intelligent harvesting of updates for catalogues (GI-cat); add "modification_time" to GetCapabilities metadata and allow for "updated_since <date>" request in DescribeCoverage. Note: this may be beyond current WCS standard specification.
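
--- sketch: how a harvester might use a per-coverage modification time to fetch only what changed. The dictionary layout and the coverage names are made up for illustration; how "modification_time" would actually appear in the GetCapabilities metadata is still open.

```python
from datetime import datetime, timezone


def updated_since(modification_times, since):
    """Return the names of coverages modified at or after `since`, sorted.

    modification_times maps a coverage name to its last modification time
    (timezone-aware datetimes, so comparisons are unambiguous).
    """
    return sorted(
        name
        for name, modified in modification_times.items()
        if modified >= since
    )


# Hypothetical catalogue state, as a harvester might have parsed it.
catalogue = {
    "pm25": datetime(2011, 8, 24, 12, 0, tzinfo=timezone.utc),
    "ozone": datetime(2011, 8, 1, 0, 0, tzinfo=timezone.utc),
}
changed = updated_since(catalogue, datetime(2011, 8, 20, tzinfo=timezone.utc))
```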

Delivery of Point Station data

(see also WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type)

  • put logic/processing into SQL database as much as possible (views, stored procedures, etc)
    • try to maintain unit tests for this
  • need to discuss in more detail how this can be served using WCS (Paul/NILU)

--- discussion: new OGC standard for netcdf/CF1.6 allowing for representation of point data. Different views on the same data depending on who (which client) wants to access data: geodynamical fluid view = WCS; detailed description of feature = WFS; sensor based view = SOS

--- really important workshop outcome will be to outline the architecture of a future AQ network: what are server capacities? what are the requirements for the clients? Do we need/want brokers that can translate requests (for example an SOS request into a WCS/WFS request) or should this become a part of the server?

--- netcdf output of station data: see CF-netCDF-ExtensionFor-netCDF Data Model

technical Access restrictions to WCS

  • HTTP Basic authentication
  • API key
  • does not have to be 100% secure, more about connecting with the users, knowing who they are and to establish an accepted way of accessing the data
  • firewall whitelisting might be an option for small user groups

--- discussion: this is about sharing and exposing data - hence, data that shall not be openly accessible should not be put on the network in the first place. Unfortunately, reality is more complex: some AQ data are restricted and we risk losing a lot of (free) data if we can't also accommodate at least some restricted data.

--- more important in the short-term is user tracking: might allow for more specific services (bandwidth etc.), alerts, guidance, etc. - also important to demonstrate service use and convince funding agencies.
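
--- sketch: what the two lightweight options listed above (HTTP Basic authentication, API key) look like from the client side. The Basic header format is the standard one; the query parameter name "api_key" is an assumption, since no key mechanism has been agreed on.

```python
import base64


def basic_auth_header(username, password):
    """Build the standard HTTP Basic 'Authorization' header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def with_api_key(params, api_key, key_name="api_key"):
    """Return a copy of the query parameters with the key added.

    key_name is hypothetical; a real server would define its own.
    """
    tagged = dict(params)
    tagged[key_name] = api_key
    return tagged
```

Either variant identifies the user on every request, which is what user tracking needs; neither is secure without HTTPS.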

Relationship with other Servers

  • write a wrapper to import data formats when needed

--- need to define which other servers we want to connect to.

WCS 2.0

  • more modular: core and extensions
  • potentially easier to use/implement?
  • proper CF-NetCDF extension coming

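Processing Services

  • idea: community provides online processing service for their discipline
    • not part of WCS, but separate service
    • protocol: WPS (web processing service, http://www.opengeospatial.org/standards/wps)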

extended (Time) filtering

  • day of week, hour of day, day of month,... (including ranges)
  • describe non-standard features in capabilities document?
  • might be difficult to get into official standard?
  • does not/should not interfere with standard if you don't use it
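
--- sketch: the kind of filter meant here, applied to a list of time stamps. The function and parameter names are hypothetical; ranges are passed as ordinary Python ranges or sets, and a filter left as None is not applied.

```python
from datetime import datetime


def filter_times(times, weekdays=None, hours=None, days_of_month=None):
    """Keep only the time stamps that match every filter that is given.

    weekdays uses Python's convention: Monday is 0, Sunday is 6.
    """
    def keep(t):
        return (
            (weekdays is None or t.weekday() in weekdays)
            and (hours is None or t.hour in hours)
            and (days_of_month is None or t.day in days_of_month)
        )

    return [t for t in times if keep(t)]


# Example: weekday morning hours (06:00 to 09:59) only.
samples = [
    datetime(2011, 8, 22, 5),   # Monday, too early
    datetime(2011, 8, 22, 7),   # Monday, matches
    datetime(2011, 8, 22, 12),  # Monday, too late
    datetime(2011, 8, 27, 7),   # Saturday, wrong day
]
rush_hour = filter_times(samples, weekdays=range(0, 5), hours=range(6, 10))
```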

time zone support

  • server should be able to reprocess time axis time zone according to user request
    • doing this on the user side is not an option, as it would lead to too many mistakes
    • should be relatively easy to simply return time axis using the time zone from the request
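
--- sketch: returning the time axis in the zone from the request, assuming the server stores its axis as timezone-aware UTC values and the request carries a fixed UTC offset in hours (both are assumptions; named zones with daylight saving rules would need more than this).

```python
from datetime import datetime, timedelta, timezone


def shift_time_axis(times_utc, utc_offset_hours):
    """Return the same instants expressed in the requested fixed offset."""
    target = timezone(timedelta(hours=utc_offset_hours))
    shifted = []
    for t in times_utc:
        if t.tzinfo is None:
            # Naive values would silently be treated as local time.
            raise ValueError("time axis values must be timezone-aware")
        shifted.append(t.astimezone(target))
    return shifted


axis = [datetime(2011, 8, 25, 4, 2, tzinfo=timezone.utc)]
local_axis = shift_time_axis(axis, -6)
```

The instants do not change, only their representation, so the data values stay aligned with the axis.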