Candidate Technical Topics

From Earth Science Information Partners (ESIP)

Data Network: Context and Scope (Non-IT Issues)

What is the scope of the data network? Geographic, Variables?

What user types constitute the stakeholders in the network?

What organizations are stakeholders in the network?

What few things must be the same…? Autonomy-interoperability balance

Data Network legitimacy, governance, impediments

Data Servers: Technical Realization (IT) Issues and Solutions

netCDF is the standard format for multi-dimensional data. CF-netCDF is used both as an archival format and as a payload format for WCS queries. [Define here the Python interface issue/question in about one sentence.]

WCS exists in multiple versions: 1.0, 1.1.2, 2.0. The AQ Community Server (AComServ) is now implemented using WCS 1.1.2. [Define here the WCS version (WCS 2.0) issue in about one sentence.]
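
As a point of reference for the version discussion, here is a minimal sketch of how a client might assemble a WCS 1.1.2 key-value GetCoverage request. The server address, coverage identifier, CRS URN and output format are assumptions made for illustration; they are not taken from AComServ, and a real server advertises its supported values in its capabilities and coverage descriptions.

```python
from urllib.parse import urlencode


def build_getcoverage_url(base_url, identifier, bbox, time_sequence,
                          fmt="application/x-netcdf", store=False,
                          version="1.1.2"):
    """Assemble a WCS 1.1-style KVP GetCoverage URL.

    bbox is (minx, miny, maxx, maxy). The CRS URN below is an assumed
    default (lon/lat WGS 84), not something a given server must accept.
    """
    minx, miny, maxx, maxy = bbox
    params = [
        ("service", "WCS"),
        ("version", version),
        ("request", "GetCoverage"),
        ("identifier", identifier),
        ("BoundingBox",
         f"{minx},{miny},{maxx},{maxy},urn:ogc:def:crs:OGC:2:84"),
        ("TimeSequence", time_sequence),
        ("format", fmt),
        # store=false asks for the data in the response itself;
        # store=true asks the server to keep a result file and return a link.
        ("store", "true" if store else "false"),
    ]
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and coverage name, for illustration only.
url = build_getcoverage_url("http://example.org/wcs", "pm25",
                            (-130, 20, -60, 55), "2010-07-01")
```

A WCS 2.0 request uses a different parameter set, which is part of what the version question above would need to address.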

Data display is through WMS. AQ data can be delivered through WCS or WFS. In AComServ, WCS is used for transferring n-dimensional grid and point-station data; WFS is used for delivering monitoring station descriptions. [Define here issues related to this choice.]

What are the WCS issues/solutions for delivering grid data? [This is a loaded question… how to tackle it?]

What are the WCS issues/solutions for serving station-point data? [This is a loaded question… how to tackle it?]

What are the WCS server performance issues/solutions? [Define performance issues, measurements.]

Server co-development tools, maintenance, version control, documentation

Relationship to non-AComServ WCS servers

Data Network: Technical Realization (IT) Issues and Solutions

Network-level data flow, usage statistics (Google Analytics), performance

What is or should be the functionality of an AQ Network catalog? Data granularity?

What is ‘interoperability’ for catalogs? Interoperability with whom? What standards are needed? CF naming extensions?

What are the generic (ISO, GEOSS, INSPIRE) and the domain-specific discovery metadata for AQ?

Minimal metadata for data provenance, quality, access constraints?

Single AQ Catalog? Distributed? Service-oriented?

Data Network: Content

Data Scope: Ambient fixed station observations, satellite observations, emissions, models.

Data content: primary and secondary (storage).

Data content: raw, derived.

Selected Server Topics

Design Issues

  • WCS version 1.0, 1.1.2, 2.0?
  • Combining WMS, WFS, WCS?

Server for Different Data Types

  • Grid data (model, emissions, satellite)
  • Point-station data (surface networks)
  • Other data types?

Server Maintenance-Support

  • SourceForge, documentation guides
  • Server code governance

Server Performance

  • Remote access or cache (??)
  • Extraction of vertical levels (how to implement in a useful and WCS 1.1/2.0 compliant way)
  • Streaming concept:
    • the idea is to speed up delivery of netCDF files by avoiding a local copy before the download (if store=false)
    • requires separating header creation from writing the variable data (this is probably possible in the C API, but we are not sure about the Python API)
    • it would be useful to know the output file/stream size before creation (to show download progress, estimate download time, etc.)
    • might not require "real" streaming formats such as ncstream: ncstream is a new format that differs from the current netCDF 3/4 formats, so users would have to convert from ncstream to "normal" netCDF locally if they want a netCDF file in the end. ncstream may therefore be useful for specific client applications, but perhaps not for general file download
  • Metadata request caching (store the response XML so that it is available for a subsequent request with the same parameters)
  • File caching
    • input file caching: keep input files open (for a while) so that subsequent reads from them are quicker (no need to parse the netCDF header again, etc.)
      • this seems to have worked on Windows but failed on Linux; more work is needed
    • output file caching:
      • if the temp dir still holds a result file that fits the request, deliver that instead of generating a new one
        • might collide with the streaming approach, at least for the store=false parameter
  • Maybe limit the maximum dataset size that can be requested with the store=true parameter, to avoid excessive local copy operations on the server side
    • requires a reliable output file size estimator
    • the server would return an exception if the estimated size is over a given threshold
    • would force people to use store=false for large datasets, so that data could be streamed without a local copy
    • should not violate the WCS 1.1 standard, as only store=false is mandatory
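
The size-limit idea in the last item can be sketched as follows. This is only a rough estimator under stated assumptions: it counts uncompressed values per variable plus a fixed header allowance, ignores padding and netCDF-4 compression, and the names `estimate_output_bytes`, `check_store_request` and `RequestTooLarge` are hypothetical, not part of any existing server code.

```python
from math import prod

# Bytes per value for the netCDF classic external data types.
DTYPE_BYTES = {"byte": 1, "char": 1, "short": 2,
               "int": 4, "float": 4, "double": 8}


class RequestTooLarge(Exception):
    """Raised when a store=true request exceeds the configured limit."""


def estimate_output_bytes(variables, header_allowance=8192):
    """Estimate the uncompressed output size in bytes.

    variables is a list of (shape, dtype) pairs describing the requested
    subset. header_allowance is an assumed flat budget for the header;
    the real header size depends on the number of attributes.
    """
    total = header_allowance
    for shape, dtype in variables:
        total += prod(shape) * DTYPE_BYTES[dtype]
    return total


def check_store_request(variables, store, max_store_bytes):
    """Return the estimated size; reject oversized store=true requests."""
    size = estimate_output_bytes(variables)
    if store and size > max_store_bytes:
        raise RequestTooLarge(
            f"estimated {size} bytes exceeds store=true limit of "
            f"{max_store_bytes} bytes; retry with store=false")
    return size


# One float variable with 24 times, 10 levels, 180 x 360 grid cells.
subset = [((24, 10, 180, 360), "float")]
size = estimate_output_bytes(subset)  # 62,208,000 data bytes + 8192 header
```

The same estimate could also feed the download-progress idea under the streaming concept, since it is available before any data is written.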

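For the metadata request caching and output file caching items in the Server Performance list, both need a way to decide that two requests have "the same parameters". A minimal sketch, assuming KVP requests where parameter names are case-insensitive and values are case-sensitive; `request_cache_key` and `ResponseCache` are hypothetical names, and the in-memory dictionary stands in for whatever store (temp dir, database) a server would really use.

```python
import hashlib
import json


def request_cache_key(params):
    """Return a stable key for a KVP request.

    Parameter names are lower-cased and sorted so that ordering and
    name casing do not produce different keys; values are left as-is.
    """
    normalized = sorted((name.lower(), value)
                        for name, value in params.items())
    canonical = json.dumps(normalized)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ResponseCache:
    """In-memory cache of responses keyed by normalized request."""

    def __init__(self):
        self._store = {}

    def get_or_create(self, params, build):
        """Return the cached response, calling build() only on a miss."""
        key = request_cache_key(params)
        if key not in self._store:
            self._store[key] = build()
        return self._store[key]


# Two spellings of the same request map to the same key.
key_a = request_cache_key({"SERVICE": "WCS", "request": "GetCapabilities"})
key_b = request_cache_key({"Request": "GetCapabilities", "service": "WCS"})
```

A real cache would also need an expiry or invalidation rule so that responses are regenerated when the underlying data changes; that is left out here.
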
Selected Network Topics

AQ Network Design Issues

  • Autonomy-interoperability balance
  • Network Catalog(s)

AQ Community Catalog

  • Domain/Application Catalog(s)

Network Metadata Issues

  • Discovery metadata for AQ
  • Provenance, quality, security

Network Operation, Maintenance

  • Governance, Legitimacy

Selected Client Topics

Client Applications

  • Regulations/Directives
  • Air quality/composition science
  • Informing the public

Client Design Issues

  • Desktop vs web-based
  • Workflow? Mashups?

Community tools and methods

  • Tools …
  • Etc etc