Difference between revisions of "Subcommittee on Interoperability"

From Earth Science Information Partners (ESIP)

Revision as of 23:59, June 25, 2008


Subcommittee Telecons

Charge for Subcommittee:

Decide on a standard that we can use to build a connection to flow data and then demonstrate that ability. We can then improve this ability by incrementally expanding the scope of what is transferred.

Guiding Principles:

We should comply with OGC (the Open Geospatial Consortium) and GEO (Group on Earth Observations) standards; we should also coordinate to ensure we fit into future plans for the EN (Environmental Information Exchange Network).

The initial focus will be on air quality data, but we also need to consider metadata and standard names (for things like pollutants/parameters, units of measure, time, and station/site identifiers). The WMO (World Meteorological Organization) and GEO may be guideposts for this.

In order to facilitate the work of the Interoperability Subcommittee, a wiki workspace was set up on the topic of Interoperability of Data systems. This workspace is on the ESIP wiki and will be used to accommodate inter-agency and inter-disciplinary participation.

WCS

We used a heuristic approach: we discussed standards that appear to be gaining broad adoption (e.g., within GEO), are supported by reputable organizations (W3C, OGC, OASIS), and have been widely accepted in the information technology and environmental monitoring communities. This led us to choose WCS (the Web Coverage Service from OGC) as the first service we will attempt to pilot. There was general agreement that WCS is a safe place to start (e.g., for units, nomenclature, etc.); however, it may cover only a subset of what needs to be considered.

We need to agree on:

  1. a messaging exchange scheme (perhaps SOAP or KVP or HTTP/REST?)
  2. a common definition of layer (that is, if a model has 15 layers and 100 parameters, must you get it all?)
  3. payload structures and formats.

Messaging Exchange Scheme

With WCS, the call is a single request string, so key-value pairs (KVP) are the only option.
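As an illustration of what such a single-string call might look like, here is a minimal sketch that assembles a KVP request URL. The parameter names follow the WCS 1.0.0 GetCoverage KVP encoding as commonly documented; the endpoint `https://example.org/wcs`, the coverage name `ozone`, and the helper name `build_getcoverage_url` are hypothetical, and a real server may require further parameters (for example, grid size or resolution).

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getcoverage_url(endpoint, coverage, bbox, time, fmt):
    """Assemble a WCS GetCoverage request as one KVP-encoded URL string.

    endpoint and coverage are placeholders, not real services.
    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "TIME": time,
        "FORMAT": fmt,
    }
    # urlencode percent-escapes reserved characters such as ',' and ':'
    return endpoint + "?" + urlencode(params)


url = build_getcoverage_url(
    "https://example.org/wcs",   # hypothetical endpoint
    "ozone",                     # hypothetical coverage name
    (-125, 24, -66, 50),
    "2008-06-25T00:00:00Z",
    "NetCDF",
)
print(url)
```

Everything the server needs is carried in the query string, which is why a client as simple as a spreadsheet macro or a browser can issue the request.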

Common Definition of Layer

In WCS, a layer is known as a coverage and describes one parameter of one dataset.
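One way to picture this definition is the sketch below, in which a dataset with several parameters is exposed as one coverage per parameter, so a client can ask for only the one it needs. The `dataset.parameter` identifier scheme and all names are assumptions made for illustration, not something the subcommittee agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    dataset: str     # hypothetical dataset name, e.g. a model or network
    parameter: str   # one parameter of that dataset, e.g. a pollutant

    @property
    def identifier(self) -> str:
        # Assumed naming scheme: "<dataset>.<parameter>"
        return f"{self.dataset}.{self.parameter}"


def coverages_for(dataset, parameters):
    """Expose each parameter of a dataset as its own coverage."""
    return [Coverage(dataset, p) for p in parameters]


offered = coverages_for("cmaq", ["ozone", "pm25", "temperature"])
# A client interested only in ozone requests a single coverage
wanted = [c for c in offered if c.parameter == "ozone"]
print([c.identifier for c in wanted])
```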

Payloads - Format for sending Data

The list of payloads we want to consider is:

  1. NetCDF (from Unidata) with CF metadata/conventions
  2. KML (or KMZ) (Keyhole Markup Language)
  3. CSV (comma separated values)
  4. EN compliant XML

Bolded options are seen as preferred payload formats for WCS. netCDF has more advantages, but CSV is simpler to use.

Each of these would require more definition before we can implement it, and we should probably pick one to begin with. For example, what would the CSV structure be? Would there be minimum requirements for station or raster data?
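To make the CSV question concrete, the following sketch shows one possible minimum structure for station data and a reader that enforces it. The column names, the site identifier, and the sample values are all hypothetical; they only echo the items listed earlier as needing standard names (parameter, units of measure, time, and station/site identifier).

```python
import csv
import io

# Hypothetical minimum column set for station (point) data
REQUIRED_COLUMNS = [
    "site_id", "latitude", "longitude", "time_utc",
    "parameter", "value", "units",
]

# Made-up sample rows, for illustration only
SAMPLE = """site_id,latitude,longitude,time_utc,parameter,value,units
SITE-001,35.0,-80.0,2008-06-25T12:00:00Z,ozone,0.045,ppm
SITE-001,35.0,-80.0,2008-06-25T13:00:00Z,ozone,0.051,ppm
"""


def read_station_csv(text):
    """Parse station CSV text, rejecting files that lack a required column."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        # Convert numeric fields; other fields stay as strings
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        row["value"] = float(row["value"])
        rows.append(row)
    return rows


records = read_station_csv(SAMPLE)
print(len(records), records[0]["parameter"], records[0]["units"])
```

Raster data would likely need a different layout (for example, grid indices or cell coordinates), which is part of what would have to be defined.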

What payloads get sent, etc? What is the “payload” if WCS is used? The KML format is widely used. CSV files can be used. XML is desired for the exchange network; file sizes are an issue, as is how to convert into Oracle. There is a need to engage Rudy Husar in this dialogue. David McCabe, also, has ideas that should be explored.

Classification of Air Quality Data

The air quality data that is to be exchanged can also be classified in many ways.

There are measurements, aggregates (daily summaries, MSA summaries, etc.), events, method descriptions, etc.

Measurements can be broad in space: on a 3-D model grid or 2-D satellite field of view (raster data), or more limited in space to a path (lidar or mobile monitor) or point (stationary monitor).

Regarding time, measurements can be a continuous series (6-second, 5-minute, or 1-hour) or discrete/instantaneous (including aggregates).
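The spatial and temporal classification above could be captured in a small data model such as the sketch below. The class and member names are hypothetical; the categories themselves come from the two preceding paragraphs.

```python
from dataclasses import dataclass
from enum import Enum


class SpatialExtent(Enum):
    GRID_3D = "3-D model grid"
    FIELD_2D = "2-D satellite field of view"
    PATH = "path (lidar or mobile monitor)"
    POINT = "point (stationary monitor)"


class TimeSampling(Enum):
    CONTINUOUS = "continuous series"
    DISCRETE = "discrete/instantaneous"


@dataclass(frozen=True)
class MeasurementKind:
    spatial: SpatialExtent
    temporal: TimeSampling

    def is_raster(self) -> bool:
        # Grids and satellite fields of view are treated as raster data
        return self.spatial in (SpatialExtent.GRID_3D, SpatialExtent.FIELD_2D)


# Example: an hourly series from a stationary monitor
hourly_monitor = MeasurementKind(SpatialExtent.POINT, TimeSampling.CONTINUOUS)
model_output = MeasurementKind(SpatialExtent.GRID_3D, TimeSampling.CONTINUOUS)
print(hourly_monitor.is_raster(), model_output.is_raster())
```

A distinction like `is_raster` matters for payload choice, since station and raster data may need different structures.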


Metadata

There are two types of metadata: technical (such as grid size, file creation date, etc.) and business (such as data source and data quality indicators, model run characteristics, or descriptions of data and how it was manipulated).

We briefly discussed a separate kind of metadata (“operational”?) to notify downstream users that something upstream has changed: the CAP (Common Alerting Protocol) from OASIS, which GEO is investigating. Getting news about critical data events was something that participants at the Data Summit thought was important. Atom and RSS are also possibilities for this.
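As a rough illustration of the Atom option, the sketch below builds a single Atom entry announcing an upstream data change. The element names and namespace are those of the Atom syndication format; the entry id, title, and summary text are invented, and a complete feed would need more (a feed element, author information, links) than is shown here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_change_entry(entry_id, title, updated, summary):
    """Return a minimal Atom <entry> describing a data change, as XML text."""
    # Serialize Atom elements with a default namespace instead of a prefix
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    fields = (
        ("id", entry_id),
        ("title", title),
        ("updated", updated),
        ("summary", summary),
    )
    for tag, text in fields:
        child = ET.SubElement(entry, f"{{{ATOM_NS}}}{tag}")
        child.text = text
    return ET.tostring(entry, encoding="unicode")


xml_text = build_change_entry(
    "urn:example:data-change:1",            # hypothetical identifier
    "Ozone data reprocessed",               # hypothetical event
    "2008-06-25T23:59:00Z",
    "Hypothetical notice: hourly ozone values were revised upstream.",
)
print(xml_text)
```

Downstream systems could poll such a feed to learn that data they have already retrieved needs to be refreshed.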

Another type of metadata discussed was that related to “discovery” of services. That is, for the ‘system of systems’ in the value chain, what data is available, from where, and how.

WCS Tools and Resources

What other materials on interoperability should be collected?

Need a WCS tutorial we could use/modify?

Current State of Data Flow and Interoperability between the 'Core' Data Systems

  • AQS and AIRNow need to communicate better about what we want to share; then a web service can be added.
  • RSIG uses WCS
  • Louis Sweeny expressed reservations about the extensibility and compatibility of WCS as a general-purpose data transport protocol, since it was originally designed for GIS coverages, but agreed that it's what we have, so it's the place to start.
  • Keyhole Markup Language (KML) adopted an open standard on format, not interoperability.
  • The files put out in KML format are related to service from Google Earth.
  • USGS is automatically updating earthquake data.

A related question involves the future of the National Environmental Information Exchange Network (NEIEN): what is the next generation of this network? It was also noted that the Exchange Network Leadership Council (ENLC) has plans for an exchange network. Chris Clark (OEI) can help with technical issues on NEIEN and on web services. It was suggested that Nick send Steve a note about what is needed; Steve will see that the note is forwarded to Chris. In addition, Linda Travers (OEI) and Chet Wayland (OAQPS) could be interested in the future of interoperability via ENLC. Their input on resource issues for the exchange of data should probably be sought. How can such a program be developed with limited resources? A more “meaty” proposal could help move this activity forward in OEI; for example, OAQPS could be put forward as an example user.

Possible Interop tests/documentation among the core network nodes(?)

A quick implementation to demonstrate the concept is important to success. Also, having a client that is easy to understand and use is important to show the value of common interoperability work. A spreadsheet with a macro or a Google Earth implementation was considered.

Managers are concerned with “see / feel / touch”. We need something quick to show managers, an indication of how it works, and identification of what the benefits are. How do you make it tangible? Put it on a spreadsheet and tie-up for the broader community. Target Google Earth. Think in service-oriented terms. Identify a client.

The following are desirable:

  • a table of data;
  • a list of services;
  • questions for EPA managers on types of data transfer;
  • a demonstration or something easy to visualize.

The following EPA-affiliated systems currently provide OGC-WCS for multiple kinds of data:

 * NASA MODIS mod04/6/7 (AOD, COT, Ozone, etc.)
   forwarded from lpweb.nascom.nasa.gov/cgi-bin/modisserver
 * NASA CALIPSO LIDAR (Backscatter)
 * NESDIS-GOES_Biomass-Burning (CO, PM25, etc.)
 * EPA CMAQ (Met & AQ)
 * EPA AIRNow (GMT-hourly Ozone, PM25)
   using datafed then adding capabilities such as regridding to CMAQ
 * EPA AQS Datamart (GMT-hourly Ozone, PM25 plus average, maxes)
 * UVNet (irradiance)
 Note rsigserver is also an OGC-WMS serving images and animations (PNG, MPEG, KMZ) to EGS.

(on a 24-hour delay).

 Note Datafed also has OGC-WMS.
  • EPA AIRNow (via above Datafed & RSIG)
  • EPA AQS (via above RSIG)

There are several data systems affiliated with EPA that could be made interoperable using the OGC WCS standard protocol:

  • EIS

AQS and AIRNow are going to pilot the WCS interface.

How about moving RSIG's OGC-WCS airnowserver to a public SonomaTech computer to remove the 24-hour delay in accessing files from Datafed's AIRNow WCS?

How about, in the long-term, having all of the data providers for small site data send their data into the AQS Datamart (on an hourly or daily basis) so their data could become OGC-WCS-accessible via the existing rsigserver (which already has access to the AQS Datamart DB and could likely be quickly modified to handle this additional data)?


At the March 12 telecon, the group recommended that a subcommittee be formed on interoperability of data systems to address the diversity of interoperable data standards and to make recommendations. Several volunteers agreed to participate, including David McCabe, Steve Young, Nick Mangus, Tim Dye, and Rudy Husar. The interested data systems should monitor the activities of the interoperability group. The initial activities of the group should include:

  1. Identify interoperability standards and methods.
  2. Test and apply these standards to several EPA data systems.
  3. Apply GEO principles and architecture and ESIP venues and community.