Difference between revisions of "Talk:Candidate Technical Topics"

From Earth Science Information Partners (ESIP)
Revision as of 02:20, August 24, 2011

General

  • to persuade nodes to use the server package, we need reference implementation(s) demonstrating that it works and performs well. Good news: we are pretty far along this road!

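CF-API

  • need to read and write CF-compliant files easily
    • python interface for unidata libcf? http://www.unidata.ucar.edu/software/libcf/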

Performance

  • non-compressed data preferred
  • many files vs. single file for queries
    • mapping: many files -> single identifier
      • some concerns that this might be too slow, will have to try and find a sensible balance
      • queries might get very large (the only natural limit is the dataset/identifier)
      • need to limit query size on server side (datafed browser: currently client side management)
   --- discussion: the purpose is to protect the server from unintelligent clients
   --- suggestion: link query restrictions to the user connection; registered users could get additional benefits like larger query sizes; email them information about tools (API) that allow them to do more intelligent queries
      • query size estimator on server side?
      • client library to chunk queries
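
The server-side size estimator and the client-side chunking mentioned above could look roughly like this. It is a minimal sketch under stated assumptions, not the datafed implementation: the function names `estimate_points` and `chunk_time_range`, the regular time step, and the idea of counting grid points as the query size are all made up for illustration.

```python
from datetime import datetime, timedelta


def estimate_points(start, end, step, n_lat, n_lon):
    """Rough size estimate: number of grid points a query would return."""
    n_times = int((end - start) / step) + 1
    return n_times * n_lat * n_lon


def chunk_time_range(start, end, step, n_lat, n_lon, max_points):
    """Split [start, end] into sub-ranges that each stay under max_points."""
    per_time = n_lat * n_lon
    if per_time > max_points:
        raise ValueError("a single time step already exceeds the limit")
    times_per_chunk = max_points // per_time
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        # Each chunk covers at most times_per_chunk time steps.
        chunk_end = min(chunk_start + step * (times_per_chunk - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + step
    return chunks
```

A server could call something like `estimate_points` to reject oversized requests, while a client library could call `chunk_time_range` and issue one request per returned sub-range.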

Common NetCDF Python Interface, NetCDF4

  • Kari cloned the PyNIO interface for Windows, so no problem right now for cross-platform development
  • solve other problems first, keep an eye open
  • NetCDF4 might make things more complicated if you try to use all features, might not be easily mappable to WCS concept

Delivery of other data formats, other input formats

  • need to map other formats to WCS and/or CF concept
  • differentiate between format (NetCDF) and convention (CF)
  • chain with WMS server for default views/previews

revision tracking of Datasets

  • always try to get current data when dealing with real-time data; always expect your data to be old
  • would be nice to have WCS field for "last updated" date, same for NetCDF/CF (global attribute?)
    • we can make something up on our own for a start
    • try to propose that for CF (and WCS)
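
As a starting point for "making something up on our own", the sketch below stamps a set of global attributes with a timestamp and compares two copies. It uses a plain dict as a stand-in for NetCDF global attributes so that it runs without any NetCDF library. The attribute name `last_updated` and both helper functions are hypothetical; neither CF nor WCS defines them.

```python
from datetime import datetime, timezone

# Hypothetical attribute name; neither CF nor WCS defines it.
LAST_UPDATED_ATTR = "last_updated"


def stamp(global_attrs, when=None):
    """Return a copy of the global attributes with an ISO 8601 UTC timestamp."""
    when = when or datetime.now(timezone.utc)
    stamped = dict(global_attrs)
    stamped[LAST_UPDATED_ATTR] = when.astimezone(timezone.utc).isoformat()
    return stamped


def is_stale(cached_attrs, server_attrs):
    """True if the server copy is newer than the cached one, or either is unstamped."""
    cached = cached_attrs.get(LAST_UPDATED_ATTR)
    server = server_attrs.get(LAST_UPDATED_ATTR)
    if cached is None or server is None:
        return True  # without a stamp, assume the cached data is old
    return datetime.fromisoformat(server) > datetime.fromisoformat(cached)
```

Treating unstamped data as stale follows the rule above of always expecting your data to be old.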

Delivery of Point Station data

(see also WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type)

  • put logic/processing into SQL database as much as possible (views, stored procedures, etc)
    • try to maintain unit tests for this
  • need to discuss in more detail how this can be served using WCS (Paul/NILU)
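
A small illustration of keeping the logic in the database and covering it with a unit test. This is only a sketch: it uses an in-memory SQLite database (which has views but no stored procedures), and the table `obs`, the view `daily_mean`, and the column names are assumptions, not the schema of any existing station database.

```python
import sqlite3
import unittest


def make_db():
    """Create an in-memory database with a hypothetical observation table and view."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE obs (station TEXT, time TEXT, value REAL);
        -- The aggregation lives in the database, not in the server code.
        CREATE VIEW daily_mean AS
            SELECT station, date(time) AS day, AVG(value) AS mean_value
            FROM obs
            GROUP BY station, date(time);
    """)
    return db


class DailyMeanViewTest(unittest.TestCase):
    def test_mean_per_station_and_day(self):
        db = make_db()
        db.executemany(
            "INSERT INTO obs VALUES (?, ?, ?)",
            [
                ("A", "2011-08-24 00:00:00", 10.0),
                ("A", "2011-08-24 12:00:00", 20.0),
                ("B", "2011-08-24 06:00:00", 5.0),
            ],
        )
        rows = db.execute(
            "SELECT station, day, mean_value FROM daily_mean ORDER BY station"
        ).fetchall()
        self.assertEqual(rows, [("A", "2011-08-24", 15.0), ("B", "2011-08-24", 5.0)])
```

Saved as a module, the test can be run with `python -m unittest`. The server code then only needs to select from the view.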

--- discussion: new OGC standard for NetCDF/CF 1.6 allowing for representation of point data. Different views on the same data depending on which client wants to access it: geodynamical fluid view = WCS; detailed description of feature = WFS; sensor-based view = SOS
--- a really important workshop outcome will be to outline the architecture of a future AQ network: what are the server capacities? What are the requirements for the clients? Do we need/want brokers that can translate requests (for example an SOS request into a WCS/WFS request), or should this become part of the server?

technical Access restrictions to WCS

  • HTTP Basic authentication
  • API key
  • does not have to be 100% secure, more about connecting with the users, knowing who they are and to establish an accepted way of accessing the data
  • firewall whitelisting might be an option for small user groups
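
From the client side, the first two options could be sketched as below. The Basic authentication header follows the standard scheme (base64 of `user:password`). The query parameter name `apikey` and the helper names are assumptions, since the notes do not say how a key would be passed.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(user, password):
    """Build the Authorization header for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def with_api_key(base_url, params, api_key):
    """Append query parameters plus a hypothetical 'apikey' parameter to a URL."""
    query = urlencode({**params, "apikey": api_key})
    return f"{base_url}?{query}"
```

Basic credentials are only encoded, not encrypted, so neither option is secure without HTTPS. That is in line with the stated goal of identifying users rather than locking data down.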

--- discussion: this is about sharing and exposing data; hence, data that shall not be openly accessible should not be put on the network in the first place. Unfortunately, reality is more complex: some AQ data are restricted, and we risk losing a lot of (free) data if we can't also accommodate at least some restricted data.
--- more important in the short term is user tracking: it might allow for more specific services (bandwidth etc.), alerts, guidance, etc. It is also important for demonstrating service use and convincing funding agencies.

Relationship with other Servers

  • write a wrapper to import data formats when needed

--- need to define which other servers we want to connect to.

WCS 2.0

  • more modular: core and extensions
  • potentially easier to use/implement?
  • proper CF-NetCDF extension coming

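Processing Services

  • idea: community provides online processing service for their discipline
    • not part of WCS, but separate service
    • protocol: WPS (Web Processing Service, http://www.opengeospatial.org/standards/wps)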

extended (Time) filtering

  • day of week, hour of day, day of month,... (including ranges)
  • describe non-standard features in capabilities document?
  • might be difficult to get into official standard?
  • does not/should not interfere with standard if you don't use it
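
What such a filter would compute can be sketched as follows. The function and parameter names are hypothetical and say nothing about how the filter would be expressed in a WCS request. Each criterion is an optional inclusive `(lo, hi)` range, and omitted criteria are ignored, which mirrors the point that the extension should not interfere if you don't use it.

```python
from datetime import datetime


def in_range(value, bounds):
    """True if no bounds are given or lo <= value <= hi (inclusive)."""
    if bounds is None:
        return True
    lo, hi = bounds
    return lo <= value <= hi


def filter_times(times, weekday=None, hour=None, day_of_month=None):
    """Keep timestamps matching every given (lo, hi) range.

    weekday uses datetime.isoweekday(): Monday is 1, Sunday is 7.
    """
    return [
        t for t in times
        if in_range(t.isoweekday(), weekday)
        and in_range(t.hour, hour)
        and in_range(t.day, day_of_month)
    ]
```

For example, `filter_times(times, weekday=(1, 5), hour=(8, 17))` keeps weekday daytime values only. Ranges that wrap around, such as 22:00 to 04:00, are not handled in this sketch.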