2008-05-07: Subcommittee on Interoperability Telecon Minutes

From Earth Science Information Partners (ESIP)


Subcommittee on Interoperability

Ad Hoc Advisory Committee -- Air Quality Data Summit
May 7, 2008, 2:00 – 3:00pm EDT
Conference Room C351K
Call-in 919-541-1590

Conference Call Meeting Notes

Original Word doc by Nick Mangus.

COMMITTEE MEMBERS, Attendees 5/07/08 bolded:

Tim Dye / Steve Ludwig
Les Hook
Rudy Husar
David McCabe
Nick Mangus
Joe Tikvart
Steve Young

Rudy and David were not present on the call, so this document is not complete, comprehensive, or binding and does not represent consensus. Please comment.

NEXT CONFERENCE CALL:

Wednesday, May 21, 2008 from 2:00 – 3:00pm EDT. Conference calls are scheduled on a bi-weekly basis.

MAJOR DECISIONS / CONCURRENCES:

There was general agreement that WCS is a safe place to start for data standards.

ACTION ITEMS:

Creation of the following was identified as desirable: a table of data; a list of services; questions for EPA managers on types of data transfer; a demonstration of data transfer; and something easy to visualize.

COMMITTEE DISCUSSION:

Chair of Subcommittee: A chair is invited to step forward. Alternatively, the subcommittee can operate on a basis of “self-governance”; this appears acceptable to all for the present.

Charge for Subcommittee: This needs to be addressed in discussions and is still an open issue. We feel we have essentially the following charge from the larger group: decide on a standard that we can use to build a connection to flow data and then demonstrate that ability. We can then improve this ability by incrementally expanding the scope of what is transferred.

Here are some principles to observe

We should be compliant with OGC (the Open Geospatial Consortium) and GEO (Group on Earth Observations) standards; we should also coordinate to ensure we fit into future plans for the EN (Environmental Information Exchange Network).

The initial focus will be on air quality data, but we also need to consider metadata and standard names (for things like pollutants/parameters, units of measure, time, and station/site identifiers). The WMO (World Meteorological Organization) and GEO may be guideposts for this.

Summary of discussions

Tim Dye indicated that we should identify what we are going to “build” as soon as possible, and then get going on it. Regarding data standards, we should pick a standard to start with, and identify the next steps and action items. Web Coverage Service (WCS) standards are a good place to start.

Steve Young suggested keeping track of the Open Geospatial Consortium (OGC), making maximum use of their standards, and staying abreast of standards development. Jerry Johnston (OEI) is a good source to which questions can be addressed. We should begin to frame questions about standards issues. These might include metadata, data exchange, and making use of the infrastructure for exchange networks.

An example could be the sort of standards that exist for naming conventions, e.g. NAAQS, or universal conventions for naming international sites. We could leverage what the meteorological community has done with the World Meteorological Organization (WMO). However, it is unknown if WMO has any activities involving air quality related chemical compounds. Also, OEI is involved with “systems of registries”, e.g., CAS, IUPAC. There is also European work on thesauruses and semantics. NARSTO has used explicit abbreviations with explanations (CAS); there is a list of names on the NARSTO website. We need to separate topics, brainstorm a list of factors and standards, and list what we expect to be changed, e.g., station, modeling, raster data, and parameter names.

We did not consider a long list of possible standards and narrow it to one based on objective criteria. Instead, we relied on a heuristic method of discussing standards that appear to be making universal inroads (e.g., into GEO) and are supported by reputable organizations (W3C, OGC, OASIS) and have been widely accepted in the information technology and environmental monitoring communities. This led us to decide that the WCS (web coverage service from OGC) would be the first service we attempt to pilot. There was general agreement that everyone seems happy with WCS and that it is a safe place to start, e.g., units, nomenclature, etc. We can’t go wrong using WCS where it makes sense, but it may only be a subset of what needs to be considered.


This is not the end of the discussion, however, but just the beginning. We need to agree on: (1) a messaging exchange scheme (perhaps SOAP or KVP or HTTP/REST?), (2) a common definition of layer (that is, if a model has 15 layers and 100 parameters, must you get it all?), and (3) payload structures and formats.
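
As an illustration of the KVP option in item (1), a WCS request can be written as key-value pairs in an HTTP GET query string. The sketch below builds a WCS 1.0.0 GetCoverage URL in Python. The endpoint, the coverage name, and the bounding box are hypothetical placeholders, not services or layers the subcommittee has agreed on.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; no real service is implied.
BASE_URL = "http://example.org/wcs"


def build_getcoverage_url(coverage, bbox, time, width, height, fmt="NetCDF"):
    """Build a WCS 1.0.0 GetCoverage request using KVP (key-value pair) encoding."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                     # a single coverage name per request
        "CRS": "EPSG:4326",                       # longitude/latitude coordinates
        "BBOX": ",".join(str(v) for v in bbox),   # minx,miny,maxx,maxy
        "TIME": time,                             # ISO 8601 timestamp
        "WIDTH": width,                           # output grid size in cells
        "HEIGHT": height,
        "FORMAT": fmt,                            # requested payload format
    }
    # urlencode percent-encodes reserved characters such as ':' and ','.
    return BASE_URL + "?" + urlencode(params)


# Hypothetical coverage name and extent (roughly the continental United States).
url = build_getcoverage_url(
    "ozone_surface", (-125, 24, -66, 50), "2008-05-07T18:00:00Z", 200, 100
)
print(url)
```

Requesting one named coverage for one bounding box and time, as in this sketch, is one possible answer to the "must you get it all?" question in item (2); the payload named in FORMAT is the subject of item (3).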

The list of payloads we want to consider is:

  1. NetCDF (from unidata) with CF metadata/conventions
  2. KML (or KMZ) (Keyhole Markup Language)
  3. CSV (comma separated values)
  4. EN compliant XML

Each of these would require more definition before we can implement it, and we should probably pick one to begin with. For example, what would the CSV structure be: would there be minimum requirements for station or raster data?
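
To make the CSV question concrete, here is one possible minimum structure for station (point) data, sketched in Python. The column names are assumptions chosen for illustration and were not discussed on the call; the sample row is made up and is not a real measurement.

```python
import csv
import io

# Hypothetical minimum columns for station data (not an agreed standard).
REQUIRED_COLUMNS = [
    "site_id", "latitude", "longitude", "datetime_utc",
    "parameter", "value", "units",
]


def write_station_csv(rows):
    """Serialize rows to CSV text, rejecting any row that lacks a required column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row is missing required columns: {missing}")
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample record, for illustration only.
sample = [{
    "site_id": "SITE-001",
    "latitude": 35.0,
    "longitude": -78.9,
    "datetime_utc": "2008-05-07T18:00:00Z",
    "parameter": "ozone",
    "value": 0.041,
    "units": "ppm",
}]
csv_text = write_station_csv(sample)
print(csv_text)
```

Raster data would likely need a different layout (for example, grid origin and cell size in place of a site identifier), which is part of why each payload needs further definition.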

Other points noted are:

  • AQS and AIRNOW need to better communicate on what we want to share; then a web service can be added.
  • RSIG uses WCS, but Louis Sweeny has expressed reservations.
  • Keyhole Markup Language (KML) adopted an open standard on format, not interoperability.
  • The files put out in KML format are related to service from Google Earth.
  • USGS is automatically updating earthquake data.

A related question concerns the future of the National Environmental Information Exchange Network (NEIEN): what is the next generation of this network? Also, it was noted that the Exchange Network Leadership Council (ENLC) has plans for an exchange network. Chris Clark (OEI) can help with technical issues on NEIEN and on web services. It was suggested that Nick send Steve a note about what is needed, and Steve will see that the note is forwarded to Chris. In addition, Linda Travers (OEI) and Chet Wayland (OAQPS) could be interested in the future of interoperability via ENLC. Their input on resource issues for the exchange of data should probably be sought. How can such a program be developed with limited resources? A more “meaty” proposal could help move this activity forward in OEI; for example, OAQPS could be put forward as an example user.

Discussion turned to payloads, i.e., the format for sending data through. What payloads get sent? What is the “payload” if WCS is used? The KML format is widely used. CSV files can be used. XML is desired for the exchange network; file sizes are an issue, as is how to convert into Oracle. There is a need to engage Rudy Husar in this dialogue. David McCabe also has ideas that should be explored.

Other discussion:

  • CAP is an OASIS standard flagged by GEOSS. CAP can address a couple of items and can be used with ENVIROFLASH to push information down the chain.
  • RSS could be important as it is being used in EPA; the public affairs office and John Shirey are sources of information.
  • There is a need for a registry or discovery service for architecture; everyone in GEOSS is concerned with registry; we might be able to build off this.
  • Tom Scheitlin is another contact, concerning Environmental Geoweb Service; this can be leveraged.
  • There is a need to include metadata so everyone can find each other.

Demonstration

We also decided that a quick implementation to demonstrate the concept is important to success. Also, having a client that is easy to understand and use is important to show the value of common interoperability work. A spreadsheet with a macro or a Google Earth implementation was considered.

Managers are concerned with “see / feel / touch”. We need something quick to show managers, an indication of how it works, and identification of what the benefits are. How do you make it tangible? Put it on a spreadsheet and tie-up for the broader community. Target Google Earth. Think in service-oriented terms. Identify a client.

In summary, the following are desirable:

  • a table of data;
  • a list of services;
  • questions for EPA managers on types of data transfer;
  • a demonstration or something easy to visualize.

Taxonomies

The following areas were raised for discussion, but no real time was spent on them. These notes are Nick’s attempt to start some lists.

It was suggested by the larger group that we make two lists: what we need to exchange and what methods we can use (how) to exchange this information. We did not discuss this much on the call, but here are a few ways of making these lists, as background.

What to exchange

There are two fundamental categories: data and metadata.

Metadata

There are two types of metadata: technical (like a grid size, file creation date, etc.) and business (like data source and data quality indicators, model run characteristics, or descriptions of data and how it was manipulated).

We briefly discussed a separate kind of metadata (“operational”?) to notify downstream users that something upstream has changed: the CAP (Common Alerting Protocol) from OASIS, which GEO is investigating. Getting news about critical data events was something that participants at the Data Summit thought was important. Atom and RSS are also possibilities for this.

Another type of metadata discussed was that related to “discovery” of services. That is, for the ‘system of systems’ in the value chain, what data is available, from where, and how.

Data

The air quality data that is to be exchanged can also be classified in many ways.

There are measurements, aggregates (daily summaries, MSA summaries, etc.), events, method descriptions, etc.

Measurements can be broad in space: on a 3-D model grid or 2-D satellite field of view (raster data), or more limited in space to a path (lidar or mobile monitor) or point (stationary monitor).

Regarding time, measurements can also be a continuous (6 second, 5 minute, or 1 hour) series or discrete/instantaneous (including aggregates).
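
The spatial and temporal classifications above could be captured in a small data structure, which might later feed a discovery or metadata record. The following Python sketch is one hypothetical way to do so; the class names, field names, and example datasets are assumptions for illustration and do not come from any agreed schema.

```python
from dataclasses import dataclass
from enum import Enum


class SpatialExtent(Enum):
    GRID_3D = "3-D model grid"
    FIELD_2D = "2-D satellite field of view"
    PATH = "path (lidar or mobile monitor)"
    POINT = "point (stationary monitor)"


class TimeBasis(Enum):
    CONTINUOUS = "continuous series"
    DISCRETE = "discrete/instantaneous"


@dataclass(frozen=True)
class DatasetDescriptor:
    """Hypothetical descriptor classifying a dataset by space and time."""
    name: str
    spatial: SpatialExtent
    time_basis: TimeBasis

    def is_raster(self):
        # Grids and satellite fields of view are treated as raster data.
        return self.spatial in (SpatialExtent.GRID_3D, SpatialExtent.FIELD_2D)


# Illustrative examples only; the dataset names are made up.
hourly_monitor = DatasetDescriptor(
    "hourly ozone at a fixed site", SpatialExtent.POINT, TimeBasis.CONTINUOUS
)
model_output = DatasetDescriptor(
    "gridded model output", SpatialExtent.GRID_3D, TimeBasis.CONTINUOUS
)

for ds in (hourly_monitor, model_output):
    print(ds.name, "| raster:", ds.is_raster(), "|", ds.time_basis.value)
```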