Difference between revisions of "Talk:2011-06-13: Multiple versions of data"

From Earth Science Information Partners (ESIP)
 
Latest revision as of 00:00, June 17, 2011

ACP WCS services -- Stefan Falke (Sfalke) 9 June 2011 (MDT)

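Martin, here are the items we said we'd point you to, the ACP Viewer and WCSes with satellite-derived gaseous columns:

* ACP Map Viewer: http://wdc.dlr.de/acp/map_viewer2/indexW.html
* GOME-2 O3, NO2 (total), NO2 (tropo) - DLR: http://wdc.dlr.de/cgi-bin/gome2_l3.cgi?Service=WCS&Version=1.0.0&Request=GetCapabilities
* OMI NO2 (total), NO2 (tropo), O3 (and some other parameters) - NASA: http://acdisc.sci.gsfc.nasa.gov/daac-bin/wcsL3?service=wcs&request=getCapabilities&version=1.0.0
* AIRS CO (and many other parameters) - NASA: http://acdisc.sci.gsfc.nasa.gov/daac-bin/wcsAIRSL3?service=wcs&request=getCapabilities&version=1.0.0
* OMI NO2 (total), NO2 (tropo), SO2 - DataFed: http://data1.datafed.net:8080/ACDISC?service=WCS&acceptversions=1.1.2&Request=GetCapabilities
* OMI NO2 (total), NO2 (tropo), O3 - CIRA: http://viper.cira.colostate.edu:8080/gsfc?service=WCS&acceptversions=1.1.2&Request=GetCapabilities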

The last WCS is new to me - found it through uFind. Rudy can probably give background.

We look forward to your feedback and follow-on discussions on how well these services and their data meet your needs and any suggestions on how they might be made more useful. Stefan
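The services in this thread are addressed with two request conventions: the DLR and NASA endpoints take `version=1.0.0`, while the DataFed and CIRA endpoints take `acceptversions=1.1.2`. A minimal Python sketch of building either form of GetCapabilities URL follows. The helper name `get_capabilities_url` is hypothetical, the base URLs are the ones quoted in the thread, and nothing here contacts a server.

```python
from urllib.parse import urlencode


def get_capabilities_url(base_url, wcs_version):
    """Build a WCS GetCapabilities URL (hypothetical helper, no network access)."""
    params = {"service": "WCS", "request": "GetCapabilities"}
    if wcs_version.startswith("1.0"):
        # WCS 1.0.0 endpoints in the thread name the version directly.
        params["version"] = wcs_version
    else:
        # WCS 1.1.x endpoints in the thread use acceptversions instead.
        params["acceptversions"] = wcs_version
    return base_url + "?" + urlencode(params)


nasa_url = get_capabilities_url(
    "http://acdisc.sci.gsfc.nasa.gov/daac-bin/wcsL3", "1.0.0"
)
cira_url = get_capabilities_url(
    "http://viper.cira.colostate.edu:8080/gsfc", "1.1.2"
)
print(nasa_url)
print(cira_url)
```

A client that wants to compare the services would fetch each URL and inspect the returned capabilities document; that step is left out here because the response layout differs between the two WCS versions.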

Re: ACP WCS services -- Clynnes 10 June 2011 (MDT)

Stefan, Are you sure CIRA is serving Tropospheric NO2? I only saw two coverages in the Capabilities doc, total column NO2 and total column O3. Chris

Re: Re: ACP WCS services -- Stefan Falke (Sfalke) 10 June 2011 (MDT)

Yes, but it's not listed in the GetCapabilities. The DataFed WCS is similar. They are set up as additional dimension fields. We'd need to have Rudy or Kari explain the details. You can see them when they are displayed in the DataFed browser (for example, http://webapps.datafed.net/datafed.aspx?wcs=http://viper.cira.colostate.edu:8080/gsfc&coverage=ColumnAmountNO2&field=NO2Trop). Stefan

Re: Re: Re: ACP WCS services -- Wyang 10 June 2011 (MDT)

This service uses WCS v1.1.2 where multiple fields can be included in one coverage. The following fields are served:
Colum Amount NO2 CS30
Colum Amount NO2 Troposphere CS30
CloudFraction
ColumnAmountO3
Wenli

Re: Re: Re: Re: ACP WCS services -- Rhusar 10 June 2011 (MDT)

Wenli, thanks for pointing out that we use WCS version 1.1.2. We find that version 1.1 offers a natural mapping for the data structures used in air quality:
Coverage >>> maps to >>> Dataset, i.e. a set of homogeneous observations that share space/time dimensions. It may be a specific model, obs network, etc.
Field >>> maps to >>> Obs or model Parameter, e.g. AOT, ColumnNO2, SurfaceSo4
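The Coverage/Field mapping can be pictured as a small data structure. The sketch below is only an illustration in Python: the class names and the `coverages_offering` helper are hypothetical, and the coverage and field identifiers are the ones that appear in the endpoint URLs quoted in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    """One observed or modelled parameter inside a coverage."""
    name: str


@dataclass
class Coverage:
    """A dataset: homogeneous observations sharing space/time dimensions."""
    identifier: str
    fields: List[Field] = field(default_factory=list)

    def field_names(self):
        return [f.name for f in self.fields]


def coverages_offering(coverages, field_name):
    """Return identifiers of the coverages that carry the given field."""
    return [c.identifier for c in coverages if field_name in c.field_names()]


# Identifiers taken from the URLs in the thread; the grouping is illustrative.
cira = Coverage("ColumnAmountNO2", [Field("NO2Trop")])
datafed = Coverage("OMNO2", [Field("NO2Trop")])
matches = coverages_offering([cira, datafed], "NO2Trop")
print(matches)
```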


CIRA has a NASA Applications grant to bring satellite data to their AQ users, mostly the Visibility regulations group that uses the VIEWS Decision Support System. CIRA gets the 'raw' OMI point data from NASA ACDISC, as does DataFed: (1) download the daily files; (2) append them to the cumulative CF-netCDF file; and (3) place the DataFed/Juelich WCS server on top of the big cumulative netCDF file.
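The append step (2) can be sketched as follows. This is an in-memory stand-in, not the CIRA or DataFed code: a plain dictionary takes the place of the cumulative CF-netCDF file, the function names are hypothetical, and the grid values are made-up placeholders. It only illustrates keeping the time axis ordered and refusing to load the same day twice.

```python
from bisect import bisect_left
from datetime import date


def new_cumulative():
    """Stand-in for the cumulative file: a time axis plus one grid per day."""
    return {"time": [], "data": []}


def append_daily(cumulative, day, grid):
    """Add one day's grid, keeping the time axis sorted and free of duplicates."""
    if day in cumulative["time"]:
        raise ValueError(f"{day} is already in the cumulative record")
    idx = bisect_left(cumulative["time"], day)
    cumulative["time"].insert(idx, day)
    cumulative["data"].insert(idx, grid)


def get_slice(cumulative, day):
    """Return the grid stored for a given day (what a server would subset)."""
    return cumulative["data"][cumulative["time"].index(day)]


record = new_cumulative()
# Daily files may arrive out of order; the values are placeholders.
append_daily(record, date(2011, 6, 10), [[0.2, 0.3]])
append_daily(record, date(2011, 6, 9), [[0.1, 0.4]])
print(record["time"])
```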

The open-source WCS 1.1.2 server (http://wiki.esipfed.org/index.php/WCS_Server_Software#Hosting_the_Open-Source_WCS_Server_Project_at_SourceForge) was installed at CIRA to serve both the station-point data from their SQL server and the CF-netCDF gridded data for models, satellites and emissions. The DataFed/Juelich WCS server is also running at Stefan's place at Northrop, on a second server at WashU, at NILU in Norway and at CIRA. These servers offer multiple WCS services for different kinds of data. The WCS server package is being considered for the delivery of (near real time) AQ data at the European Environment Agency, Environment Canada and possibly at AIRNow.

The key goal of the AQ network (http://wiki.esipfed.org/index.php/HTAP_Data_Network:_Background,_Status) is to present to the AQ users a consistent, standard interface to all data. These are the satellite data that are in the CORE WCS network (http://webapps.datafed.net/Core.uFIND?). Given the homogeneity of these distributed WCS services, it is then possible to create universal clients that access them. The semantics of the offered data, i.e. the different aggregations and filters that are applied by a specific server, may still differ from server to server; e.g. the OMI columnar NO2 data offered by CIRA is a different flavor from the DataFed or the NASA DISC WCS version.

Re: Re: Re: Re: Re: ACP WCS services -- Wyang 10 June 2011 (MDT)

Rudy, I agree. I think that the biggest improvement in v1.1 over v1.0 is that it enables multi-field coverages, which is the common data model used in EO communities (e.g., netCDF and HDF data products). However, this version had not been widely implemented before OGC adopted a newer, modular specification design (and now a newer WCS 2.0 spec has been produced and is in the testing phase). Wenli

Re: Re: Re: Re: Re: Re: ACP WCS services -- Rhusar 17:46, 13 June 2011 (MDT)

Wenli, the way things are, the number of WCS 1.0 services for air quality is also quite limited. Also, I think that the coverage/field structure of WCS 1.1 should migrate more smoothly to WCS 2.0.

Distinguishing OMI data at different servers -- Clynnes 10 June 2011 (MDT)

Rudy, How does a WCS client distinguish between the OMI data served from our WCS server and the data served from CIRA? That is, how would a universal client make a decision on which one to use? Or is there no distinction between the two? Chris

Re: Distinguishing OMI data at different servers -- Mschultz 10 June 2011 (MDT)

Hi Christopher, a very important point indeed! While we are technically still working on just bringing things together, I fully agree with you that traceability of data sets will be a key issue as soon as we "go live". We should make a great effort to avoid duplication of data sets and establish some sort of catalogue of which data sets are hosted on which server. Also, we must discuss the metadata model(s) and how the tracing information can be honored by the server software. As an example, take the granule versus global field products from a satellite: it may be that one site (say NASA) serves the granules, while another site (for example DataFed) will use these data and process them into global fields, which can then be compared with surface obs or model results. There should be a way to "click" on the global fields to find out that they are derived from those NASA granule sets. Also, there should be a way to communicate changes to the granules from NASA to DataFed so that the global product can be re-generated in case of version changes. This should definitely be a topic for discussion at the Solta meeting! [Rudy: please add to the agenda] Best regards, Martin
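One way to picture such a catalogue is a set of records that each name their hosting server and, for derived products, the source data set and the source version that was used. The Python sketch below is purely illustrative: the record layout, function names, data set identifiers and version labels are all hypothetical and do not describe any existing catalogue or server software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CatalogueEntry:
    dataset_id: str
    server: str
    version: str
    derived_from: Optional[str] = None    # dataset_id of the source, if any
    source_version: Optional[str] = None  # source version used for derivation


def trace_to_originator(catalogue: Dict[str, CatalogueEntry], dataset_id: str) -> List[str]:
    """Follow derived_from links back to the original data set."""
    chain, seen, current = [], set(), dataset_id
    while current is not None:
        if current in seen:
            raise ValueError("provenance chain contains a cycle")
        seen.add(current)
        chain.append(current)
        current = catalogue[current].derived_from
    return chain


def stale_products(catalogue: Dict[str, CatalogueEntry]) -> List[str]:
    """List derived products built from an outdated version of their source."""
    stale = []
    for entry in catalogue.values():
        if entry.derived_from is None:
            continue
        if entry.source_version != catalogue[entry.derived_from].version:
            stale.append(entry.dataset_id)
    return stale


# Hypothetical entries: granules at NASA, a global field derived at DataFed.
catalogue = {
    "omi-no2-granules": CatalogueEntry("omi-no2-granules", "NASA", "v2"),
    "omi-no2-global": CatalogueEntry(
        "omi-no2-global", "DataFed", "v1",
        derived_from="omi-no2-granules", source_version="v1",
    ),
}
chain = trace_to_originator(catalogue, "omi-no2-global")
stale = stale_products(catalogue)
print(chain, stale)
```

In this toy catalogue the global product was built from granule version "v1" while the granules are now at "v2", so it is flagged for re-generation, which is the notification path described above.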

Re: Distinguishing OMI data at different servers -- Rhusar 10 June 2011 (MDT)

Chris, the two OMI NO2 data versions differ in their Discovery Metadata, specifically in having a different Distributor in uFIND (http://webapps.datafed.net/CORE.uFIND). Of course, the service endpoints for the two NO2 data sets are also different:
http://viper.cira.colostate.edu:8080/gsfc&coverage=ColumnAmountNO2&field=NO2Trop
http://data1.datafed.net:8080/ACDISC&coverage=OMNO2&field=NO2Trop
R
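A client could tell the two offerings apart by splitting each endpoint into its server, coverage and field, and then selecting by distributor. The sketch below is a rough Python illustration: the endpoint strings are copied as written above (they join the parameters with `&` and have no `?`), while the function names and the distributor labels used as dictionary keys are assumptions made for the example.

```python
from urllib.parse import parse_qs


def split_endpoint(endpoint):
    """Split 'server&coverage=...&field=...' into (server, coverage, field)."""
    server, _, query = endpoint.partition("&")
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return server, params.get("coverage"), params.get("field")


# Endpoints as quoted in the thread; the distributor labels are assumed keys.
endpoints = {
    "CIRA": "http://viper.cira.colostate.edu:8080/gsfc&coverage=ColumnAmountNO2&field=NO2Trop",
    "DataFed": "http://data1.datafed.net:8080/ACDISC&coverage=OMNO2&field=NO2Trop",
}


def select_by_distributor(endpoints, distributor):
    """Pick the endpoint registered for one distributor and split it."""
    return split_endpoint(endpoints[distributor])


cira_choice = select_by_distributor(endpoints, "CIRA")
datafed_choice = select_by_distributor(endpoints, "DataFed")
print(cira_choice)
print(datafed_choice)
```

Both entries expose the same field name under different coverage identifiers, so the field alone cannot identify the source; the distributor (or server) has to be part of the selection.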

Re: Re: Distinguishing OMI data at different servers -- Clynnes 10 June 2011 (MDT)

Rudy, that means that we have 3 versions in the wild, not 2, if you include our GES DISC WCS server. Chris

Re: Re: Re: Distinguishing OMI data at different servers -- Rhusar 10 June 2011 (MDT)

Yup, there are multiple OMI NO2 versions out there, each having different value-adding features of the service and mutations of the processed data. Is having different versions good or bad? It depends on what you want to do. For research, we might as well get the data out to the science wilderness and let it mutate until it becomes robust enough to survive the test of time. Rudy

Re: Re: Re: Re: Distinguishing OMI data at different servers -- Mschultz 10 June 2011 (MDT)

Rudy, No - I strongly disagree here! This means that the choice of the dataset will in practice be determined by the wrong criteria (accessibility, data format, access speed, etc.) and not by the scientific robustness! There have been very tough discussions on data set quality in the GAW programme for example and just yesterday I learned about a nice project in Germany where dataset quality and a review process for "publication" of data are investigated. I strongly believe that any data set "out in the wild" should be traceable back to the originator and all modifications must be documented. Cheers, Martin

Re: Re: Re: Re: Re: Distinguishing OMI data at different servers -- Gleptukh 10 June 2011 (MDT)

I fully agree with Martin here. The quality of the original data is not going to improve by mutating and mating incorrect versions of the data. Provenance or, at least, attribution is paramount. On the positive side, we have reached the point where technological progress is pushing us to force not only providers and brokers, but also clients, to deal with attribution. Thanks, Greg

Re: Re: Re: Re: Distinguishing OMI data at different servers -- Clynnes 10 June 2011 (MDT)

Rudy, lately we have been hearing a lot of concerns from both NASA HQ and our User Working Group about the difficulty users face in determining whether a given dataset is the authoritative one. They particularly don't want to see a mutation being misinterpreted as, say, a NASA standard product. It may not be an issue in the AQ community, but the climate change community is a different story. Chris