WCS Development Issues


Back to WCS Wrapper

Miscellaneous things that have come up in the design, development, and deployment of the datafed WCS system, including email conversations and personal observations.

Using Datafed client on NILU test DB with slightly irregular time dimension [Hoijarvi]

http://webapps.datafed.net/datafed.aspx?page=test/NILU

The map view is an aggregate over 30 months, so it's kind of slow.

Since there is some problem with the time periodicity in EBAS, I made the map view aggregate over one month. This demonstrates seasonal browsing.

http://webapps.datafed.net/datafed.aspx?page=test/NILU_month

Running WCS on NILU EBAS schema using sybase [Hoijarvi Eckhardt]

Hello,

It actually looks pretty good. I can browse it better than I expected on the first try. But please check your DB connection. Currently your server reports an error:

Layer: 0, Origin: 0 Unable to connect: Adaptive Server is unavailable or does not exist

I made a test page for tweaking and testing the settings:

http://webapps.datafed.net/datafed.aspx?page=test/NILU

As soon as you have the DB online, I'll improve it. Currently there is a time range aggregate on the map view so that you can see something.

Other than that, the thing looks good, here are my first comments:

   On 10/27/2010 8:27 AM, Paul Eckhardt wrote:
   > Those tests revealed some more questions:
   >
   > 1) for one coverage you assume a constant measurement interval, not only
   > across all locations, but also over the whole time series.

This can be dealt with in many ways. In general, the data fields in a coverage should share dimensions. So if you have data that actually does share dimensions, put it in one coverage. If you have hourly data and aggregated daily averages, make two coverages. If most of the locations are the same, the data can be in one coverage. If the location tables are different, make separate coverages. That's the easy part.

It seems that EMEP data does not fit this simple form, so we have to think about how to configure the WCS.

   > This is generally not the case for emep data:
   > 1a) every station has their own measurement cycle (some changing filters
   > at midnight, others at 6am, this will again differ for the precipitation
   > samplers.
   No comment on the importance or consequences of this.
   > 1b) not all measurements have the same resolution (everything from
   > minute resolution to mothly averages is in principle possible)

This probably means separating the data by time resolution into separate coverages.

   > 1b) some stations change their schedule over time.
   > 1c) there is no guarantee, that a timeseries is complete (i.e. no gaps)

This has not been a problem. For example, CIRA/VIEWS has data twice a week; still, we consider it a daily dataset.

http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/CIRA&coverage=VIEWS&field=SO4f&datetime=2007-02-08

In general, when the field is normalized into a (loc_code, datetime, value) triple, missing data is just a missing row.
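
As a minimal sketch of that normalization (table and column names here are made up for illustration, not the actual EBAS or VIEWS schema), an sqlite table holding one field could look like this; gaps in the time series are simply absent rows:

   import sqlite3

   # Hypothetical illustration only; real EBAS/VIEWS table and column names differ.
   con = sqlite3.connect(":memory:")
   con.execute("""CREATE TABLE SO4f_data (
                      loc_code TEXT NOT NULL,
                      datetime TEXT NOT NULL,   -- ISO 8601 timestamp
                      value    REAL NOT NULL,   -- no NULLs, no -999 markers
                      PRIMARY KEY (loc_code, datetime))""")
   # Two observations for one station, one for another; missing days have no rows.
   con.executemany("INSERT INTO SO4f_data VALUES (?, ?, ?)",
                   [("ST01", "2007-02-08T00:00:00", 1.2),
                    ("ST01", "2007-02-12T00:00:00", 0.9),
                    ("ST02", "2007-02-08T00:00:00", 2.4)])
   con.commit()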

   > 1d) time series are not necessarily unique (there might me multiple
   > timeseries for one parameter at one location)
   

Does this mean that there could be hourly and daily SO4? That's no problem; again, those two are separate coverages.

If the data is truly random, like 1-20 measurements at random times during the day, and different at each location, then we cannot have a periodic time dimension. The time dimension must then be just a range, without periodicity or enumerated times.

I haven't tested such coverages; yours might be the first one. There's no technical reason why it should be a huge problem.

   > To address 1b, I added a filter criteria to only use timeseries with a
   > resolution of 1 day.

Seems to work for this test!

   > For the rest, I'm a bit lost. Maybe we need to aggregate (homogenize the
   > intervals) for all measurements in an intermediate database, in order to
   > fit in the OGC model?
   > Do we need to aggregate a unique value for each station, parameter, and day?

The wonderful thing about SQL is that it gives you this option but does not enforce it. We compiled GSOD, Global Summary Of the Day, and did a ton of on-the-fly calculations using SQL views, producing what we needed directly from the original data. Later, some views were materialized for performance reasons. If you feel like it, go ahead and create tables for the WCS views and populate them from the data.
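
As a rough illustration of that idea (not the actual GSOD code; this reuses the hypothetical SO4f_data table from the sketch above), you can start with an on-the-fly view and materialize it later when performance requires:

   import sqlite3

   con = sqlite3.connect(":memory:")
   con.execute("CREATE TABLE SO4f_data (loc_code TEXT, datetime TEXT, value REAL)")
   # On-the-fly daily aggregate as a view; computed at query time.
   con.execute("""CREATE VIEW WCS_SO4f_daily AS
                  SELECT loc_code, date(datetime) AS datetime, AVG(value) AS value
                  FROM SO4f_data
                  GROUP BY loc_code, date(datetime)""")
   # Later, materialize the view into a table when performance requires it.
   con.execute("CREATE TABLE WCS_SO4f_daily_snapshot AS SELECT * FROM WCS_SO4f_daily")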

   > 2) the parameter naming conventions are not easily convertible with the
   > emep naming.
   > As an example there is SO4f (which I interpreted as SO4 particles in the
   > pm2.5 fraction - is this correct?), but emep distinguishes between
   > sea-salt-corrected SO4 and total SO4. Same for SO4t (which I understand
   > is the whole aerosol fraction, without a size cutoff?).
   > Please see http://tarantula.nilu.no/projects/ccc/nasaames/complist.html
   > for a (hopefully up-to-date?) list of components/matrixes used in emep.

We are using CF standard names:

http://webapps.datafed.net/table.aspx?database=catalog&table=standard_names

compiled from http://cf-pcmdi.llnl.gov/documents/cf-standard-names/standard-name-table/15/cf-standard-name-table.xml

If you have two measurements of the same SO4 made with two different types of instruments and you want to publish both, go ahead and call the fields SO4_instrA and SO4_instrB, but choose the standard_name attribute from the CF table if possible.

I'm not an expert in CF naming, but sea-salt-corrected SO4 and total SO4 sound like two different CF standard names. In any case, your fields can be called sea_salt_corrected_SO4 and total_SO4; that's fine. They may have the same or different standard names, that's no problem.
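
To illustrate the idea with a hypothetical configuration fragment (this is not the exact conf_dict layout used in NILU_config.py, and the standard_name value should be checked against the CF table):

   # Hypothetical field configuration; field names are yours, standard_name comes
   # from the CF standard name table where a suitable entry exists.
   fields = {
       "sea_salt_corrected_SO4": {
           "units": "ug/m3",
           "standard_name": "mass_concentration_of_sulfate_dry_aerosol_in_air",
       },
       "total_SO4": {
           "units": "ug/m3",
           "standard_name": "mass_concentration_of_sulfate_dry_aerosol_in_air",
       },
   }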

   >
   > Minor issues:
   > If I use the datafed browser, I can see some hiccups. I'm not sure if
   > this is due to my wrong configuration of the ows software...
   >
   > 1) the map layer is empty. Are there no features for the european
   > region, or is there a problem with the geo-references from my side?
   I had left the North America map as the default view. I changed it to the world map. The browser should pick a proper map based on catalog defaults.
   > 2) the defuault time displayed in the browser is 2010-07-29T06:00:00,
   > which i can not relate to any response from the wcs server. The
   > describeCoverage response contains:
   > <TemporalDomain>
   >   <TimePeriod>
   >    <BeginPosition>1972-01-01T06:00:00Z</BeginPosition>
   >    <EndPosition>2007-01-07T06:00:00Z</EndPosition>
   >    <TimeResolution>P1D</TimeResolution>
   >   </TimePeriod>
   > </TemporalDomain>

There is a problem in the "make_time_dimension" function in NILU_config.py. I filled the test database with daily data.

The time_min must have the same resolution as the data. So if you have daily data, it must not have hours. If it's hourly, it cannot have minutes.

So change

   time_min = iso_time.parse(row[i_time_min]).dt
   time_max = iso_time.parse(row[i_time_max]).dt

to

   time_min = iso_time.parse(row[i_time_min]).dt.date()
   time_max = iso_time.parse(row[i_time_max]).dt.date()

and the periodic dimension should work. I have to add a check for this in the iso_time module.

The SQL query is then compiled with datetime='2010-07-29T00:00:00' and requires precision; one second off and it won't return anything. If the data is not on precise time intervals, you can set a time window for the map view in the browser, to get all the data from the requested datetime plus or minus a short period.
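
A hedged sketch of such a tolerant query (hypothetical table and column names, using sqlite's julianday for the window arithmetic):

   import sqlite3

   def query_time_window(con, loc_code, center_iso, minutes=30):
       """Fetch values within +/- `minutes` of the requested datetime.
       Hypothetical schema; the real WCS query builder differs."""
       sql = """SELECT datetime, value FROM SO4f_data
                WHERE loc_code = ?
                  AND julianday(datetime) BETWEEN julianday(?) - ? / 1440.0
                                              AND julianday(?) + ? / 1440.0"""
       return con.execute(sql, (loc_code, center_iso, minutes,
                                center_iso, minutes)).fetchall()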


   > 3) changing the time field in the browser updates the map, but somehow
   > the timeseries display is not changed (still showing 2010)
   > I can manually "zoom" out&  in the time axes of the time series plot though.
   The timeseries display should update the cursor, the blue vertical line. It does not update the time zoom.
   > 4) once I get the timeseries displayed, the y-scale seems to be fixed to
   > [0.0, 1.0]. What might be the reason for this? The values reported by
   > the wcs are correct, and agree with what is plotted (at least the values
   > below 1.0)

You can set a different scale. For each field, I have to set a default zoom, currently manually, but I'm planning to monitor registered WCS's and then I can have a script generating a better default. Click on the "Service Program" button to change settings.

   >
   > However, here's the best display I could get:
   > 2010-10-27-141538_620x691_scrot.png
   > This is a display for 15th Jan 2006, 06:00. The cross at the location of
   > Birkenes, Norway and the corresponding timeseries plotted correctly.

That's better than I expected. As soon as you have the DB online, I'll take another look.

   > Kind regards
   >
   > Paul
   >
   >

Two more things: make sure you don't have critical passwords in your Python code, since the sources are HTTP accessible. Create a read-only account and, to be safe, keep the connection module outside OWS/web/static.

Also in http://knulp.nilu.no:8080/static/NILU/wcs_capabilities.conf change the "Distributor:DataFed" to "Distributor:NILU"

You should add "IS NOT NULL" in the filters for location and data views, as well as filter out data with -999 as missing marker. I never use nulls or -999 except when I have to filter them out at the lowest level data access.

Greetings, Kari

Existing HTAP Services [Hoijarvi Decker]

Look into the point provider to see how to set up a database for a point WCS.

The CIRA provider shows how to make a custom processor for a database where you cannot create views.

Kari

   On 10/27/2010 2:57 AM, Michael Decker wrote:
   > Hi,
   >
   > those should all be valid and unique providers. HTAP_monthly* should be fully CF-compliant, 
   in the others that might not be entirely so, but that might improve in the future.
   >
   > Another thing:
   > I guess we will also have a look at point station data soon. Do you have some documentation 
   about what you have done there so far or some helpful query links for me to look at? I did not 
   look at the source code for that at all so far but probably will within the next few weeks...
   >
   > Michael
   >
   > On 10/26/10 17:37, Kari Hoijarvi wrote:
   >> Hi,
   >>
   >> I have registered these services into our catalog:
   >>
   >> http://webapps.datafed.net/HTAP.uFIND
   >> http://webapps.datafed.net/catalog.aspx?table=WCS_services&distributor=juelich
   >>
   >>
   >> Are all these valid, or are some of them duplicates?
   >>
   >> HTAP_FC_pressure
   >> HTAP_monthly
   >> MACC_bnds
   >> HTAP_FC
   >> HTAP_FE
   >> HTAP_FE_hourly
   >> HTAP_monthly_pressure
   >> HTAP_FE_pressure
   >>
   >
   >

Getting Timeseries from WCS [Hoijarvi Falke]

Expand the bbox a little bit:

http://webapps.datafed.net/cov_73556.ogc?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.0.0&CRS=EPSG:4326&COVERAGE=DD&TIME=2001-01-01T00:00:00/2001-01-31T00:00:00&BBOX=-119.6109,36.2301,-117.6109,38.2301,0,0&WIDTH=1&HEIGHT=1&DEPTH=-1&FORMAT=CSV

There is a known problem with rounding; I should fix this in a future version of OWS.

By the way, you can call the WCS 1.1 services via our 1.0 proxy, but don't send these links anywhere; the proxy is only set up so that our browser will work. The id 73556 is completely random and may change at any time. So use the real service instead:

http://ww10.geoenterpriselab.com/dd?service=WCS&version=1.1.2&Request=GetCoverage&identifier=dd&BoundingBox=-119.6109,36.2301,-117.6109,38.2301,urn:ogc:def:crs:OGC:2:84&TimeSequence=2001-01-01T00:00:00/2001-01-31T00:00:00&format=image/netcdf&store=false

store=false works fine with firefox.

Kari

   On 10/26/2010 3:53 PM, Falke, Stefan R (IS) wrote:
   >
   > Kari,
   >
   >  
   >
   > I’m trying to get a GetCoverage example for a time series request 
   on the degree day dataset. I’ve created the following using the TimeSeries 
   WCS Query form in the DataFed Browser but it doesn’t return anything. 
   Do you see anything wrong with the query? I made the request for a single 
   lat-lon point, which creates an unusal bbound in the WCS request but 
   I don’t know if that’s the issue:
   >
   >  
   >
   > http://webapps.datafed.net/cov_73556.ogc?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.0.0&CRS=EPSG:4326&COVERAGE=DD&TIME=2001-01-01T00:00:00/2001-01-31T00:00:00&BBOX=-118.6109,37.2301,-118.6109,37.2301,0,0&WIDTH=1&HEIGHT=1&DEPTH=-1&FORMAT=CSV
   >
   >  
   >
   > I generated this from this page:
   >
   > http://webapps.datafed.net/datafed.aspx?page=NGC/DegreeDays
   >
   >  
   >
   > Thanks,
   >
   > Stefan

GeoEnterpriseLab got new service online [Hoijarvi Roberts Falke]

   You could create yourself an organization here:
   http://webapps.datafed.net/table.aspx?database=catalog&table=organizations&mode=edit&edit=7019&edit_mode=dup&message=Creating+new+from+existing+item.
   Choose a nice looking abbreviation like GEOEL, enter the new information and save. Then you can use that abbreviation as distributor and originator.
   Kari
   On 10/26/2010 12:55 PM, Roberts, Gregory (IS) wrote:
   > Kari,
   >  
   > I updated the wcs_capabilities.conf and also added index.html to each wcs directory.  
   One thing though, I had to remove the Distributor:NorthropGrumman and Originator:NorthropGrumman 
   from the keywords, since we are not allowed to show where the data is from.  Let me know if these are correct.
   >  
   > They're there.
   >
   > http://webapps.datafed.net/catalog.aspx?table=WCS_services&sort=domain+asc
   >
   > Greg, could you please add the keywords to wcs_capabilities.conf so that the catalog search facility can be used.
   >
   >
   > Kari
   >


http://webapps.datafed.net/catalog.aspx?table=WCS_coverages&originator=GEL&pagesize=50

Looks good.

Kari

   On 10/26/2010 2:08 PM, Roberts, Gregory (IS) wrote:
   > I created GeoEnterprise Lab (GEL and updated all the wcs_capabilities.conf with Distributor:GEL, Originator:GEL
   >  
   >

Unit tests to fight regression bugs [Hoijarvi Decker]

I pulled and reviewed your patches already.

Unit tests are great indeed and I'm using them for my own code as well. The problem with the OWS unit tests is that many of them never worked for me because of platform differences, etc. So I started ignoring them. I agree that this is a very poor solution and we should try to make them work for everybody. I will look into them again.

Michael

   On 10/25/10 17:01, Kari Hoijarvi wrote:
   > Thanks.
   >
   > You should add a unit test for that. They pay off in the long run,
   > especially in cross-platform porting when I write minio.
   >
   >
   >
   > There are a few patches that you should pull, review, and publish:
   >
   > * fixed windows-only is_string call
   >
   > that fixes a windows-only wrong name bug
   >
   >
   > * use_nio flag by sys.platform
   >
   > A little more convenient way to set use_nio
   >
   >
   >
   > * added SupportedCRS and SupportedFormat to wcs capabilities
   >
   > This probably does nothing more but adds 4 nodes to each
   > coveragedescription in the capabilities, but I put it there anyway.
   >
   >
   > Kari
   >
   >
   > On 10/25/2010 2:52 AM, Michael Decker wrote:
   >> It's fixed and published now. When I changed the code for X/Y
   >> filtering, I forgot the T filter...
   >>
   >> Michael
   >>
   >> On 10/21/10 22:56, Kari Hoijarvi wrote:
   >>> This used to work, it was in my unit tests.
   >>>
   >>> http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?Service=WCS&Version=1.1.2&Request=GetCoverage&Identifier=GEMAQ-v1p0_SR5SA_tracerm_2001&Format=application/x-netcdf&Store=false&TimeSequence=2001-04-16&RangeSubset=ap&BoundingBox=-180,-90,360,90,urn:ogc:def:crs:OGC:2:84
   >>>
   >>>
   >>>
   >>> It reports: unhashable type: 'list'
   >>>
   >>> Is this a newly introduced regression bug?
   >>>
   >>> Kari
   >>>
   >>>
   >>
   >
   -- 
   Michael Decker
   Forschungszentrum Jülich
   Institut für Energie- und Klimaforschung - Troposphäre (IEK-8)
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de

NILU/EBAS Gets Going [Hoijarvi Eckhardt]

Fantastic!

Best

R

   2010/10/21 Paul Eckhardt <Paul.Eckhardt@nilu.no>
   Hi Rudy,
   I actually started testing Kari's version yesterday. I got his prototype (using the sqlite database) running in a local virtual Linux machine so far. Next, I will look into connecting our Sybase database directly. I'm currently starting to get into it. Can I contact you next week? Then I'll know more about how far I could get on my own, or at least will be able to ask specific questions.
   Thanks for your support so far!
   Paul


   On 22.10.2010, at 01:05, Rudolf Husar <rhusar@me.wustl.edu> wrote:
   >     Hello Paul and Aasmund,
   >
   >     Hope life is treating you well... maybe way too busy but well.
   >
   >     Could we have a brief session on the WCS server implementation for 
   EMEP data on EBAS? Kari is now quite familiar with your data structure 
   and he would be happy to help connecting the SQL server to the WCS server software.
   >
   >
   >     Best regards,
   >
   >     Rudy
   >
   >      
   >
   >     On Fri, Oct 8, 2010 at 3:04 PM, Kari Hoijarvi <hoijarvi@seas.wustl.edu> wrote:
   >
   >          Hello all,
   >
   >         I have configured our WCS server for NILU database schema. 
   I copied stations and test data from CIRA/VIEWS, filled up a sqlite database and here it is online:
   >
   >         http://128.252.202.19:8080/NILU
   >
   >         You can also view it online via our browser. 
   http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/NILU&coverage=EBAS&field=SO4f
   >
   >         The defaults are bad, click the next day in time controller to go to 2006-11-13 and 
   >          you'll see some data.
   >
   >         To install this, you need a Windows machine with a public IP address.
   >         The project is at http://sourceforge.net/p/aq-ogc-services/home/
   >         Installation instructions. http://wiki.esipfed.org/index.php/WCS_Wrapper_Installation_WindowsOP
   >         I made simplifications to the code, the netcdf libraries are now an option.
   >         Once you have the point demo provider working, you can download your own stuff:
   >         http://128.252.202.19:8080/static/NILU/NILU.zip
   >         Unzip this and move the NILU folder to C:\OWS\web\static\ just where the point folder is. You should have now the NILU mock server running.
   >         The files under web\static\NILU
   >         EBAS.db: This is the mock sqlite database with dummy test data.
   >         index.html: Your home page
   >         wcs_capabilities.conf: Keywords, Contact information etc.
   >         NILU_wcs.py: The WCS server component.
   >         NILU_wfs.py: The WFS server component for the location table.
   >         EBAS_data.py: The main script that created the EBAS.db
   >         EBAS_sql.py: The SQL commands for EBAS_data.py
   >         NILU_config.py: The python dictionary/script that contains the WCS configuration
   >         To configure this to use the real sybase DBMS:
   >         Edit the EBAS_sql.py and change the def connect() method to return the real db connection. 
   >         You can either use direct sybase module for python or go via ODBC.
   >         The EBAS schema is properly normalized, which allows using SQL as a configuration tool, 
   >         using SQL views. If you absolutely don't want to create views, then we need to configure this
   >         WCS by adding rather complicated joins and aliases. Views are a lot simpler, and give a chance 
   >         later for optimizations by taking a snapshot out of them. The script that creates WCS_location 
   >         view, and a view like [WCS_SO4f] for each parameter, is in EBAS_sql.py. The methods 
   >         create_location_view and create_wcs_view do the trick. They were called when EBAS_data.py 
   >         created the mock database.
   >         Then announce the fields in NILU_config.py. The dictionary conf_dict in method compile_config is 
   >         quite self-explanatory. It lists the parameter names units, and keywords. The script the adds the 
   >         location dimension axis description in each field. After getting a hand-edited configuration 
   >         working, I'd recommend reading the field names from the database, so that the configration is 
   >         automatic. In ows-point-1.2.3.zip I'm demonstrating that technique in CIRA, I scan the VIEWS 
   >         database and compile the fields myself.
   >         The decision what is a coverage and what is a field is quite simple. A coverage is a collection 
   >         of fields that share the time dimension and spatial boundaries. So if you have same data collected 
   >         with different intervals, create an own coverage for each time interval.
   >         Now the fields share also the location dimension. The standard does not require this, if some station 
   >         does not collect a parameter, that field should not have that station in the location table. 
   >         This is in the todo list. Also there is a limitation, that you can query only one field at a time 
   >         from a point coverage. This will be addressed to allow multiple fields like in NetCDF cube queries.
   >         Good luck with installation. I will be in the office every weekday, feel free to email, skype or 
   >         call 1-314-935-6099 if you have questions. I'll be happy to help. Please remember, that this is 
   >         work in progress. I have discussed with Michael Decker about restructuring to configuration 
   >         dictionary, currently it has grown organically and the names are inconsistent. We need to 
   >         standardize to the CF-1.4 names everywhere. This means some renaming in the future, but probably 
   >         no more.
   >         Once you have your server up, you can try to browse the data with http://webapps.datafed.net/datafed.aspx?wcs=http://your.domain.here:8080/NILU&coverage=EBAS Currently the browser registers it once and updates changes daily. So if you add new fields don't expect our browser to show it at once.
   >
   >         Good Luck, Kari
   >
   >         On 9/30/2010 10:13 AM, Paul Eckhardt wrote:
   >
   >             Hi Kari and Rudy,
   >
   >             thank you for taking the time on the phone and for offering to further
   >             assist us!
   >
   >             Attached you can find the complete DDL (ebas.sql) for creating an empty
   >             ebas database (this version is a bit outdated, but it should serve our
   >             purpose).
   >
   >             I created a very reduced version for testing that just contains what I
   >             think might be interesting for the WCS server. This can be found in
   >             ebas_reduced.sql
   >
   >
   >             Some basic information about the design:
   >
   >             DS_DATA_SET describes what we call a dataset in ebas: one parameter
   >             measured at a certain station (in the full version some more complex
   >             dependecies on instrument, method etc.).
   >             ER_REGIME_CODE is always 'IMG' for observations.
   >             EM_MATRIX_NAME defines which medium the parameter is measured in (e.g.
   >             precipitation, air, pm10, ...)
   >             EC_COMP_NAME is the name of the parameter (e.g. sulphate_total, ...)
   >             DS_STARTDATE and DS_ENDDATE provide the timestamp of the first and last
   >             measurement in the timeseries.
   >
   >             A1_TIME contains the measurements. Relates n:1 to DS_DATA_SETS,
   >             FK=DS_SET_KEY.
   >
   >             EB_STATION containd the location data. relates 1:n to DS_DATA_SETS.
   >
   >
   >             In case you have questions or need more or different information, please
   >             email to me and Aasmund.
   >
   >             Cheers,
   >
   >             Paul
   >
   >
   >
   >
   >
   >
   >     -- 
   >     Rudolf B. Husar, Professor and Director
   >     Center for Air Pollution Impact and Trend Analysis (CAPITA),
   >     Washington University,
   >     1 Brookings Drive, Box 1124
   >     St. Louis, MO 63130
   >     +1 314 935 6054

GeoEnterpriseLab Installs the new WCS implementation [Hoijarvi Roberts]

Hi,

Thanks for installing the new version of OWS. I registered your new services, and they work fine. However, we are moving away from hand-registering each service as a dataset and going towards having one WCS catalog:

http://webapps.datafed.net/catalog.aspx?pagesize=50&table=WCS_services

You can browse at service, coverage or field level. However, to make your service discoverable, you need to add the proper keywords to the wcs_capabilities.conf files:

KEYWORDS: Domain:Fire, Platform:Model, Instrument:Unknown, DataType:Grid, Distributor:NorthropGrumman, Originator:NorthropGrumman, TimeRes:Day, Vertical:Surface, TopicCategory:climatologyMeteorologyAtmosphere

TimeRes is in singular. The other metadata should also be filled out.

By putting an index.html file in the provider folder, you will get a home page for each provider. Right now http://ww10.geoenterpriselab.com/vulcan says "no index.html file available for chosen provider".

In vulcan and dd, you have a dummy time dimension. Coverages that don't have a time dimension can just leave it out.

You should have the attribute <attribute name="_FillValue" type="float" value="NaN" /> in the vulcan and dd .ncml files. Right now the cubes are filled with huge values.

Kari


   On 10/21/2010 7:53 AM, Stefan Falke wrote:
   > Hi Kari,
   >
   > We have created two new WCSes using your DataFed netCDF-Cf WCS package. Can they be registered in DataFed?
   >
   > 1) Vulcan CO2 emission estimates. Please put in 'emissions' domain
   > http://ww10.geoenterpriselab.com/dd?service=WCS&acceptversions=1.1.2&Request=GetCapabilities <http://ww10.geoenterpriselab.com/dd?service=WCS&acceptversions=1.1.2&Request=GetCapabilities>
   >
   > 2) Daily degree days (test climate model output) Please put in 'Test' domain
   > http://ww10.geoenterpriselab.com/vulcan?service=WCS&acceptversions=1.1.2&Request=GetCapabilities <http://ww10.geoenterpriselab.com/vulcan?service=WCS&acceptversions=1.1.2&Request=GetCapabilities>
   >
   > We will be generating a few more in the coming weeks. Is there anything we can do on our side to help with the registration in DataFed?
   >
   > Thanks,
   > Stefan

Creating a mock DB for NILU/EBAS schema [Hoijarvi Eckhardt Martin Decker Husar Vik]

Hello all,

I have configured our WCS server for the NILU database schema. I copied stations and test data from CIRA/VIEWS, filled an sqlite database, and here it is online:

http://128.252.202.19:8080/NILU

You can also view it online via our browser. http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/NILU&coverage=EBAS&field=SO4f

The defaults are bad; click to the next day in the time controller to go to 2006-11-13 and you'll see some data.

To install this, you need a Windows machine with a public IP address.

The project is at http://sourceforge.net/p/aq-ogc-services/home/

Installation instructions. http://wiki.esipfed.org/index.php/WCS_Wrapper_Installation_WindowsOP

I made some simplifications to the code; the netcdf libraries are now optional.

Once you have the point demo provider working, you can download your own stuff:

http://128.252.202.19:8080/static/NILU/NILU.zip

Unzip this and move the NILU folder to C:\OWS\web\static\, right where the point folder is. You should now have the NILU mock server running.

The files under web\static\NILU:

EBAS.db: The mock sqlite database with dummy test data.
index.html: Your home page.
wcs_capabilities.conf: Keywords, contact information, etc.
NILU_wcs.py: The WCS server component.
NILU_wfs.py: The WFS server component for the location table.
EBAS_data.py: The main script that created EBAS.db.
EBAS_sql.py: The SQL commands for EBAS_data.py.
NILU_config.py: The Python dictionary/script that contains the WCS configuration.

To configure this to use the real sybase DBMS:

Edit EBAS_sql.py and change the connect() function to return the real DB connection. You can either use a direct Sybase module for Python or go via ODBC.
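
A hedged sketch of what that swap could look like; the pyodbc DSN and credentials are placeholders, not a tested Sybase setup:

   # Sketch of EBAS_sql.connect(); adjust for your environment.
   def connect():
       # Mock setup: the bundled sqlite database.
       # import sqlite3
       # return sqlite3.connect(r"C:\OWS\web\static\NILU\EBAS.db")

       # Real setup via ODBC; DSN and credentials below are placeholders.
       import pyodbc
       return pyodbc.connect("DSN=EBAS;UID=readonly_user;PWD=secret")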

The EBAS schema is properly normalized, which allows using SQL views as a configuration tool. If you absolutely don't want to create views, then we need to configure this WCS with rather complicated joins and aliases. Views are a lot simpler, and they leave room for later optimization by taking a snapshot of them. The script that creates the WCS_location view, and a view like [WCS_SO4f] for each parameter, is in EBAS_sql.py. The functions create_location_view and create_wcs_view do the trick. They were called when EBAS_data.py created the mock database.
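
Roughly, the views flatten the normalized EBAS tables into the shape the WCS expects: one location view plus one view per parameter. A sketch of the idea follows; the join keys and the lat/lon/value column names are guesses, and the authoritative definitions are the create_location_view and create_wcs_view functions in EBAS_sql.py:

   # Sketch only; the real column names live in EBAS_sql.py.
   CREATE_LOCATION_VIEW = """
       CREATE VIEW WCS_location AS
       SELECT station_code AS loc_code, latitude AS lat, longitude AS lon
       FROM EB_STATION
   """

   CREATE_SO4_VIEW = """
       CREATE VIEW WCS_SO4f AS
       SELECT st.station_code AS loc_code, a.sample_time AS datetime, a.value AS value
       FROM A1_TIME a
       JOIN DS_DATA_SET ds ON ds.DS_SET_KEY = a.DS_SET_KEY
       JOIN EB_STATION st ON st.station_key = ds.station_key
       WHERE ds.EC_COMP_NAME = 'sulphate_total'
   """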

Then announce the fields in NILU_config.py. The dictionary conf_dict in the method compile_config is quite self-explanatory: it lists the parameter names, units, and keywords. The script then adds the location dimension axis description to each field. After getting a hand-edited configuration working, I'd recommend reading the field names from the database, so that the configuration is automatic. In ows-point-1.2.3.zip I demonstrate that technique for CIRA: I scan the VIEWS database and compile the fields from it.

The decision about what is a coverage and what is a field is quite simple. A coverage is a collection of fields that share the time dimension and spatial boundaries. So if you have the same data collected at different intervals, create a separate coverage for each time interval.

Currently the fields also share the location dimension. The standard does not require this; if some station does not collect a parameter, that field should not have that station in its location table. This is on the todo list. There is also a limitation that you can query only one field at a time from a point coverage. This will be addressed, to allow multiple fields as in NetCDF cube queries.

Good luck with the installation. I will be in the office every weekday; feel free to email, Skype, or call 1-314-935-6099 if you have questions. I'll be happy to help. Please remember that this is work in progress. I have discussed restructuring the configuration dictionary with Michael Decker; currently it has grown organically and the names are inconsistent. We need to standardize on the CF-1.4 names everywhere. This means some renaming in the future, but probably nothing more.

Once you have your server up, you can try to browse the data with http://webapps.datafed.net/datafed.aspx?wcs=http://your.domain.here:8080/NILU&coverage=EBAS. Currently the browser registers the service once and updates changes daily, so if you add new fields, don't expect our browser to show them at once.

Good Luck, Kari

   On 9/30/2010 10:13 AM, Paul Eckhardt wrote:
   > Hi Kari and Rudy,
   >
   > thank you for taking the time on the phone and for offering to further
   > assist us!
   >
   > Attached you can find the complete DDL (ebas.sql) for creating an empty
   > ebas database (this version is a bit outdated, but it should serve our
   > purpose).
   >
   > I created a very reduced version for testing that just contains what I
   > think might be interesting for the WCS server. This can be found in
   > ebas_reduced.sql
   >
   >
   > Some basic information about the design:
   >
   > DS_DATA_SET describes what we call a dataset in ebas: one parameter
   > measured at a certain station (in the full version some more complex
   > dependecies on instrument, method etc.).
   > ER_REGIME_CODE is always 'IMG' for observations.
   > EM_MATRIX_NAME defines which medium the parameter is measured in (e.g.
   > precipitation, air, pm10, ...)
   > EC_COMP_NAME is the name of the parameter (e.g. sulphate_total, ...)
   > DS_STARTDATE and DS_ENDDATE provide the timestamp of the first and last
   > measurement in the timeseries.
   >
   > A1_TIME contains the measurements. Relates n:1 to DS_DATA_SETS,
   > FK=DS_SET_KEY.
   >
   > EB_STATION containd the location data. relates 1:n to DS_DATA_SETS.
   >
   >
   > In case you have questions or need more or different information, please
   > email to me and Aasmund.
   >
   > Cheers,
   >
   > Paul
   >
   >

Time to upgrade at GeoEnterpriseLab [Hoijarvi Roberts]

Nice to hear from you.

Your problem means you have to install the newest version. A lot has happened since you became our first external user. So much, actually, that I no longer have the code you are running, except deep in version control. I have fixed some bugs that may have caused the error, but I don't know how to work around it.

Anyway, I tested your netcdf file, and it works with the latest version http://sourceforge.net/p/aq-ogc-services/home/ from sourceforge.

I zipped up the vulcan folder for you. Try it out after installation.

Windows installation instructions are still here:

http://wiki.esipfed.org/index.php/WCS_Wrapper_Installation_WindowsOP

Libraries have changed; in particular, NetCDF now supports cubes over 2 GB, so you have to reinstall everything from scratch.

Significant changes:

- supports versions up to 1.1.2

- XML documents are now actually validated against the WCS schema. The originals were full of errors. If you have code that was reading the incorrect 1.1.0 GetCapabilities and DescribeCoverage documents, it needs to be rewritten.

- Preparing the capabilities and DescribeCoverage documents is now done with the "owsadmin wcs_prepare -ao" command, which extracts the metadata into the metadata.dat file.

- Contact information is in the wcs_capabilities.conf file

- The DescribeCoverage query requires the correct parameter "identifiers=vulcan" instead of the wrong "identifier=vulcan".

- store=false is supported and works well with Firefox.

If you install the new version, I can add you to our catalog again:

http://webapps.datafed.net/catalog.aspx?pagesize=50&table=WCS_services

Kari


  On 10/18/2010 1:08 PM, Roberts, Gregory (IS) wrote:
  >
  > Kari,
  >
  >  
  >
  > I am getting this back when I attempt to run a GetCoverage after cubing the data.  
  I’ve attached the files so you can see what I have and why it is failing.
  >
  >  
  >
  > Faulting application python.exe, version 0.0.0.0, time stamp 0x49ee4354, faulting module 
  nc3.pyd, version 0.0.0.0, time stamp 0x4a65f3d9, exception code 0x80000003, fault offset 
  0x000121cb, process id 0x8d0, application start time 0x01cb6edbc4c9819a.

Software Development best practices [Hoijarvi Husar]

This is a short description of the datafed OWS system development: what the issues were and what went well.

Some terms:

DVCS: Distributed Version Control System

OWS: OGC Web Services (Open Geospatial Consortium web services).

WCS: Web Coverage Service. Used for delivering gridded data, mainly in NetCDF format. The Datafed extension also uses it for delivering point data; other formats such as trajectories are possible.

WFS: Web Feature Service. Used for delivering static content, in this case station locations.

WMS: Web Map Service. Used to deliver data as a map image.


Description of the Development and its Best Practices.


Version Control

One of the key tools in any software project is version tracking. Even in one-person projects, it is useful to have a record of each change that was made, and if a defect is found in previously working code, it is possible to go back and see what has changed. Many regressions can be fixed faster if the original code is available.

In multi-person projects version control is a must. First and foremost, merging changes between developers is automated, and accidental overwrites of previous changes are therefore prevented. My choice of tool was Darcs http://darcs.net/, which is considered the most elegant of the current tools. The other candidates were Git http://git-scm.com/ and Mercurial http://mercurial.selenic.com/, of which Git was rejected mainly because of its second-class MS Windows support. Mercurial would have been a good choice too, but Darcs has a simpler interface and more flexibility.

A DVCS, like Darcs, is a peer-to-peer system. There is no central code database, every repository holds all of the code and the code history. You can do anything you want in your own repositories, and then publish your changes. It is up to the other parties to pull them from your public repository.

The "Darcs Workflow" picture shows our use of Darcs. There are three repositories in Datafed. I do the actual work in the development repository. When a feature is ready, I record it as one or more patches, and push them to the public repository. Michael in Juelich can then pull my patches into his development repository.

When Michael finishes a feature, he pushes the change to his public repository. I then pull it to my review repository, check the changes, and push them to my development and public repositories.

Sometimes, in the middle of a big piece of development work, I find a bug. If the fix is small enough, I don't need to publish all my changes at once: I can record the bug fix as a patch, push it to the public repository, and continue working on my big feature. If the fix is large, I can create a temporary repository, record the fixes there and publish the fix patch, then merge the changes into my development repository and go on with my work.
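
As a sketch, that quick-fix workflow is just a few darcs commands (the patch name and the public repository path are placeholders):

darcs record -m "Fix url escaping in describecoverage" (record only the bug-fix hunks as one patch)

darcs push /path/to/public/OWS (publish the fix on its own, without the unfinished feature work)

darcs pull http://webapps.datafed.net/nest/OWS (pull the other side's patches when ready)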

Sometimes two developers modify the same line at the same time. In this case, you get a conflict, which is annoying but also a fact of life in distributed development. For some in-depth information, see http://en.labs.wikimedia.org/wiki/Understanding_darcs/Patch_theory_and_conflicts

For those who want a deeper look at what version control with Darcs means, there is a nice 6-minute video http://projects.haskell.org/camp/unique showing Darcs's advantages over Git, Mercurial and other tools.

Getting the source code: get the latest darcs 2.4.x executable from http://darcs.net and run

darcs get http://webapps.datafed.net/nest/OWS (From Datafed)

or

darcs get http://htap.icg.fz-juelich.de/darcs/OWS (From Juelich)

Remember, darcs is distributed. Both Datafed and Juelich have the complete repository; it makes no difference where you get it.



Unit Testing.

Datafed OWS is designed to be a cross-platform framework, developed in a distributed open source environment. Here are a few problems that occur under these conditions:

- Subtle differences between operating systems may cause defects. Example: in Unix variants it is possible to delete a file while another program has that same file open; in Windows, this causes an error.

- Upgrading Python may cause defects. The system was developed with Python 2.5.1, the currently used version is 2.6.5, and Python 2.7 has just come out. Python 3.x versions are intentionally backwards incompatible, and the code most likely will not even compile under them.

- The low-level languages used, C and C++, can be ported to multiple operating systems, but they do not attempt to enforce portability the way Java does. For example, whether a plain char is signed or unsigned is left for the compiler to decide, and compiling both 32-bit and 64-bit versions is definitely error prone.

- Upgrading a 3rd party library may cause defects. The netcdf and lxml libraries are actively developed, and subtle differences or regression defects in new versions may cause defects.

- Changing any code may cause new defects. It is a difficult challenge to change something while keeping existing functionality and adding new functionality. This is aggravated by the fact that developers in different time zones are working in parallel without comprehensive knowledge of the whole system.

- Different time zones between developers make communication slow. This means that code documentation must be kept up to date, so that shared libraries have current documentation of their usage.

The solution to these problems is automated testing. Currently the system has 225 unit tests and 28 acceptance tests. The unit tests are just library calls; the acceptance tests are run through a real web server.

The test code to production code ratio is about 1:1. Actually it's 7:10, but that understates the coverage, since the C++ code is tested from Python and C++ is much more verbose.

The whole suite takes only 20 seconds to run, which makes test-driven development (TDD) possible. TDD was popularized by Kent Beck in eXtreme Programming, and has evolved into an independent practice. TDD repeats the Red-Green-Refactor cycle:

- Red: Write a new test or modify an existing test, and see that it actually fails. It is important to see the actual failure; it is very easy to write a test that has a defect and passes when it should not.

- Green: Write code that makes the test pass, but no more.

- Refactor: Make the code simpler, especially remove duplicated code, with all the tests still passing. Keeping things continuously simple makes it much easier to add and change things later.

At least 80% of all the code in OWS has been written in TDD fashion. While this has not resulted in totally bug-free code, the overall code quality has been good, and porting and upgrading have not caused quality regressions. Unit tests also serve as excellent sample code showing how to use a library and what it actually does. For example, the module implementing an ISO 8601 time parser http://en.wikipedia.org/wiki/ISO_8601 has 48 tests. Here is a typical one:

   def test_monthly_range_data(self):
       parsed = iso_time.parse("1995-04-22T12:00:00Z/2000-06-21T12:00:00Z/P1M")
       time = iso_time.TimePeriodicityComputed(
           "month", 1,
           iso_time.parse("1995-04-22T12:00:00").dt,
           iso_time.parse("2000-06-21T12:00:00").dt)
       assert_eq(time, parsed)

The test parses the ISO 8601 string "1995-04-22T12:00:00Z/2000-06-21T12:00:00Z/P1M", then constructs a "TimePeriodicityComputed" object with constant begin/end and periodicity, and checks that the resulting periodicity objects are equal.

Seeing 48 examples of how to use the iso_time parser is very useful for new developers.


Independent Libraries.

When the Object Oriented Programming hype was at its peak, there was a naive belief that reusable libraries would just pop out of any program development. Nowadays we know that reusability needs to be carefully designed for in order to have any hope of success.

The two big technical things to watch when designing for re-usability are dependencies and size. Dependencies are bad: If you have to install fifteen libraries to use one, there's a big temptation just to write your own. Separation of concerns is good: A library should do one thing and do it well. A time parsing module does not need a web server.

The OWS system contains an independent package "datafed", which can be installed separately without the OWS/WCS/WFS code. The package contains:

- ISO 8601 time parser

This module can parse single times, ranges, periodic sequences and enumerated sequences. Actually, this module is so good that it should be published to the Python community, since existing modules such as http://pypi.python.org/pypi/iso8601/ only deal with single ISO 8601 datetimes.

- netcdf

Under Unix variants it's possible to use PyNIO, which allows reading and writing netcdf files. Unfortunately it is not developed in a cross-platform manner, and compiling it on Windows is next to impossible. The licensing is also unclear. That's why I developed the "datafed.nc3" module, a thin Python wrapper around the netcdf C library. This module can be used under both Windows and Unix.

- ncml

NetCDF Markup Language is an XML representation of a netcdf file. The Python-based ncml interpreter can read an ncml file and create a netcdf file. Again, a useful feature for netcdf processing in general.

- model-checker.

This is a library developed by Michael Decker that checks the CF compatibility of netcdf files. It is an essential tool to have, because understanding all the aspects of the CF convention is difficult, and it is very easy to make a mistake and never notice it.


Centralized Ticket system in SourceForge.


When I started the project in October 2007, there was no need for a bug database. I could keep a to-do list for myself, and the darcs record comments maintained the overall history of the things I had done. When Decker joined the project, we exchanged emails about the development, but that's not enough.

That's why, in the future, changes that are more than a minor bug fix have to go to the SourceForge tickets, http://sourceforge.net/p/aq-ogc-services/tickets/, and should get at least a short review from somebody else. For example, ticket #5 "query without time should return data with default time" http://sourceforge.net/p/aq-ogc-services/tickets/5/ is controversial. Currently the server returns the whole time series, and fixing it to return only one time is trivial. But there may already be clients that don't put in a TimeSequence because they want to get all the times. Fixing this defect to be more standards compliant may break some users' queries. You can't just do it; you have to ask users whether this is something they do, and how they would change their client code if they install the upgrade.


Giving people the whole solution instead of a .zip file with some code.

Getting started at all is very hard.

I have noticed that when people download this WCS code, they don't know what it is all about or where to start. Most people are put off by the size of all the documents: the netCDF library, the CF conventions and the WCS specification.

Writing shorter documents like "Creating NetCDF CF Files" http://wiki.esipfed.org/index.php/Creating_NetCDF_CF_Files is helpful, but still not an easy place to start.

I have found the most success in giving people actual working code. People who want to publish grid data first need the data in CF-netCDF format. It takes about ten lines of ncml to describe the dimensions and five for each variable; that's a few minutes' work for me. Creating the empty cube is one line of Python.

Adding the data is the most time-consuming part. Usually the data is in daily, CF-incompatible netcdf slices which cannot be used directly as a source. Once you have a daily slice, it's just one line to append it to the CF-netCDF cube. Reading the data from a source is usually around 50 lines of Python.
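
As a rough sketch of that append step, here using the generic netCDF4 Python library rather than the datafed nc3/ncml tools described elsewhere on this page; the file, variable and index names are hypothetical:

   import netCDF4

   def append_daily_slice(cube_path, day_index, slice_2d):
       """Write one daily 2D slice into an existing CF-netCDF cube (sketch)."""
       cube = netCDF4.Dataset(cube_path, "a")
       cube.variables["so4"][day_index, :, :] = slice_2d   # the "one line" append
       cube.close()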

Giving this initial data compiler to someone takes less of my time than attending a meeting to discuss how to add data to their netcdf files, and it produces better results.



Evangelizing how to serve point data from your database.

Currently I'm working on the WCS for point data. Point data is a little trickier to configure, since SQL databases have no convention like CF that could be used as a metadata source. The tables and columns need to be mapped to the coverages and fields.

An interested user gave me the essential subset of the EBAS DB schema. I created this database using sqlite http://sqlite.org/ (sqlite is bundled with Python), then filled it with test data I downloaded from CIRA/VIEWS. After that I added the configuration mapping, and the EBAS demo server is running.

Now I can just zip this up and send it to Norway. All they need to do is install the framework, unzip my configuration and run the demo server. Changing it to the real server is just a matter of making the "open_connection" function connect to their Sybase server instead of sqlite. It should work out of the box; simple SQL has worked reliably for me across platforms, though DBMS access may have subtle differences between vendors.

I have documented three good normalized schemas for station point data in "WCS Wrapper Configuration for Point Data" http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#Some_Different_DB_Schema_types

EBAS is a fourth kind; I had not seen it before, but the schema is properly normalized and makes perfect sense. There is no need to panic here: using SQL views it is possible to make this DB look like the "One Data Table For Each Param" schema http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#One_Data_Table_For_Each_Param, and configuration becomes just a matter of listing the coverage and field names.

Dimensionless Vertical Coordinates from CF-1.4 conventions [Hoijarvi Decker Husar Schultz]

Hello,

I have successfully extracted vertical pressures from the formulas. There are a few places I'm not sure about:

For 'atmosphere_hybrid_sigma_pressure_coordinate':

formula_terms = ap: ap b: b ps: ps p0: p0

formula = p(n,k,j,i) = ap(k) + b(k) * ps(n,j,i)

this is what I found in the CF 1.4 http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.4/cf-conventions.html#dimensionless-v-coord
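
For reference, that formula is a plain broadcast over the coordinate arrays; a numpy sketch, assuming ap and b have shape (lev,) and ps has shape (time, lat, lon):

   import numpy as np

   def hybrid_sigma_pressure(ap, b, ps):
       """p(n,k,j,i) = ap(k) + b(k) * ps(n,j,i), returned as (time, lev, lat, lon)."""
       return (ap[np.newaxis, :, np.newaxis, np.newaxis]
               + b[np.newaxis, :, np.newaxis, np.newaxis] * ps[:, np.newaxis, :, :])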

The two others:

       'atmosphere_hybrid_height_coordinate'

you have two instance cases:

formula_terms = "z: z a: a b: b orog: orog" formula = "z(k,j,i) = a(k) + b(k)*orog(j,i)"

and formula_terms = "z: z az: az bz: bz orog: orog" formula = "z(k,j,i) = az(k) + bz(k)*orog(j,i)"

I cannot find either. What the standard says:

formula_terms = "a: var1 b: var2 orog: var3"

formula = "z(n,k,j,i) = a(k) + b(k)*orog(n,j,i)"

The formula_terms keys seem to be wrong, and the variables are not in the netcdf file.

The last one, 'hybrid_sigma_pressure': should it be 'atmosphere_hybrid_sigma_pressure_coordinate'?

formula_terms = "ap: hyam b: hybm ps: aps"

formula = "hyam hybm (mlev=hyam+hybm*aps)"

This is not in the standard at all.

Some examples:

http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=GEMAQ-v1p0_SR5SA_tracerm_2001

<cf:formula_terms>ap: ap b: b ps: ps p0: p0</cf:formula_terms>
<cf:standard_name>atmosphere_hybrid_sigma_pressure_coordinate</cf:standard_name>

The p0 variable is only used when you have 'a', not when you have 'ap'.

The p0 variable is not found, and needs to be removed from the formula_terms.


This one has a formula, but no formula_terms: http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=GEOSChem-v07_SR3EU_tracerm_2001 It looks like it should be 'ap: a b: b ps: ps'; the variable a is probably a misleading name.


There's an extra z: z in the formula_terms: http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=STOCHEM-v02_SR3EU_metm_2001

the lev axis is just zeroes http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=STOCHEM-v02_SR6SA_tracerm_2001

There may be others, but I think I have covered most of the cases.

Kari

Hi Michael, Martin:

Thanks for the note on the input data. We could use more time here on the DataFed side too. More importantly, we would really like to talk with you all more about the final report, which is not quite there either.

Martin, since the contract with EC/R just expired today, could we ask for an extension? Say a month or two? If for some reason that won't work, we could submit the 'final report' for this phase and do some of the fixing on our own.

What do you think?


Rudy


2010/9/30 Michael Decker <m.decker@fz-juelich.de>

   Hi Kari,
   Sabine is still cleaning up tons of problems in all those data files. Actually, of all the (original) files in HTAP_monthly, not one is without violation of the CF conventions.
   When Sabine is done with the cleanup, we will switch to serving the improved data. Until then it will probably be pretty difficult for you to calculate much from the files.
   I will let you know when we switch to the improved files, things should get easier then.
   Michael


   On 09/29/10 19:24, Kari Hoijarvi wrote:


       Hello,
       I have successfully extracted vertical pressures from the formulas.
       There are a few places I don't really know:
       For 'atmosphere_hybrid_sigma_pressure_coordinate'
       formula_terms = ap: ap b: b ps: ps p0: p0
       and formula = p(n,k,j,i) = ap(k) + b(k) * ps(n,j,i)
       this is what I found in the CF 1.4
       http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.4/cf-conventions.html#dimensionless-v-coord


       The two others:
       'atmosphere_hybrid_height_coordinate'
       you have two instance cases:
       formula_terms = "z: z a: a b: b orog: orog"
       formula = "z(k,j,i) = a(k) + b(k)*orog(j,i)"
       and
       formula_terms = "z: z az: az bz: bz orog: orog"
       formula = "z(k,j,i) = az(k) + bz(k)*orog(j,i)"
       I cannot find either. What the standard says:
       formula_terms = "a: var1 b: var2 orog: var3"
       formula = "z(n,k,j,i) = a(k) + b(k)*orog(n,j,i)"
       The formula_terms keys seem to be wrong and variables are not in the netcdf
       The last one, 'hybrid_sigma_pressure', should it be
       'atmosphere_hybrid_sigma_pressure_coordinate' ?
       formula_terms = "ap: hyam b: hybm ps: aps"
       formula = "hyam hybm (mlev=hyam+hybm*aps)"
       Is not in the standard at all.
       Some examples:
       http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=GEMAQ-v1p0_SR5SA_tracerm_2001


       <cf:formula_terms>ap: ap b: b ps: ps p0: p0</cf:formula_terms>
       <cf:standard_name>atmosphere_hybrid_sigma_pressure_coordinate</cf:standard_name>


       the p0 variable is only if you have 'a', not when you have 'ap'
       The p0 variable is not found, and needs to be removed from the
       formula_terms.


       This has formula, but no formula_terms
       http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=GEOSChem-v07_SR3EU_tracerm_2001
       it looks it should be 'ap: a b: b ps: ps', the variable a is probably
       misleading name


       There's an extra z: z in the formula_terms:
       http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=STOCHEM-v02_SR3EU_metm_2001


       the lev axis is just zeroes
       http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?service=WCS&request=DescribeCoverage&version=1.1.2&identifiers=STOCHEM-v02_SR6SA_tracerm_2001


       There may be others, but I think I have covered most of the cases.
       Kari


   -- 
   Michael Decker
   Forschungszentrum Jülich
   ICG-2: Troposphäre
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de



--
Rudolf B. Husar, Professor and Director
Center for Air Pollution Impact and Trend Analysis (CAPITA),
Washington University,
1 Brookings Drive, Box 1124
St. Louis, MO 63130
+1 314 935 6054

CF Names, WCS names and Facets for search [Husar Hoijarvi Decker]

Hey guys, the metadata stuff is coming along really well...

Kari, would you

move Model facet to Instrument

move Experiment to Method

move MetChem to Domain

So there would be only one new facet, called standard name.

Also, let's make this uFIND HTAP-specific. Would you confine the uFIND search to Originator HTAP and not show the Originator facet.

Thanks R

On Wed, Sep 15, 2010 at 9:55 AM, Michael Decker <m.decker@fz-juelich.de> wrote:

   Hi Kari,


        It took me a while to go through the CF 1.4 document, and capture all the
       things I can meaningfully capture from that. There's still some
       attributes, that don't really match well.
   I have worked my way through all of the CF-1.x documents recently, so if there are things unclear we can also discuss them and maybe we can find the correct interpretation then.


       Example: A Field has Title, Abstract and Keywords.
       Title matches with long_name, fine
       the global attribute "comment" is moved to DescribeCoverage "Abstract",
       but nothing in variables would match Field Abstract.
   I'm not sure who came up with the comment->Abstract mapping. It might have been me. It seems to be the closest guess one can make from CF, but of course a comment is not an Abstract. In some occasions it works out pretty well and in others it makes pretty strange things showing up as a supposed "Abstract". However, if we only pull our metadata from CF, we can't really do better I think.


       We probably should make a wiki page documenting this mapping, and decide
       some own conventions of ours extending CF-1.4
   If we can find some meaningful extensions, then I'm all for it. But we have to be careful not to split too much from the "mainline" because then in the end our own extension will be pretty much useless. Ideally, we could propose something that finds its way back into the next CF-convention.


       At the same time, it's time to clean some internal names in the python
       dictionaries. We probably should stick 100% to CF names in the
       dictionaries, and translate them into WCS names in the capabilities and
       describcoverage generation time.
   Yes, that is a good idea. It needs to get more consistent now that we have an idea where this is going.
   I also added some tickets to the SF project. Especially #6 (make WCS NetCDF output CF-compliant) would be pretty important to work on I think. Else working with files coming out of the WCS will be a pain. Also, we expect this compliance from our sources and so we should comply to it as well.
   Another thing: I think we will really have to look at your minio again and how to get it compiled on Linux. So far I had no success. It would help a great deal to unify things across all platforms.
   BTW: PyNIO 1.4.0 finally supports the attributes attribute instead of __dict__ now. I will see if I can clean that up in the code a bit this week. It should make things easier as well while we are working on the minio replacement.
   Did you have a look at my CommonUtils package? (http://repositories.icg.kfa-juelich.de/hg/CommonUtils/) I think it is helpful to bundle CF-related stuff somewhere. Probably we should try to adapt this for use with minio as well (and ultimately maybe completely switch to that).
   Michael
   -- 
   Michael Decker
   Forschungszentrum Jülich
   ICG-2: Troposphäre
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de



   -- 
   Rudolf B. Husar, Professor and Director
   Center for Air Pollution Impact and Trend Analysis (CAPITA),
   Washington University,
   1 Brookings Drive, Box 1124
   St. Louis, MO 63130
   +1 314 935 6054

PyNIO update, CommonUtils package [Decker Hoijarvi]

On 9/15/2010 2:55 AM, Michael Decker wrote:

   > Yes, that is a good idea. It needs to get more consistent now that we have an idea where this is going.
   >
   > I also added some tickets to the SF project. Especially #6 (make WCS NetCDF output CF-compliant) would 
   be pretty important to work on I think. Else working with files coming out of the WCS will be a pain. 
   Also, we expect this compliance from our sources and so we should comply to it as well.

I'll check out this stuff as soon as I have time.

   >
   > Another thing: I think we will really have to look at your minio again and how to get it compiled 
   on linux. So far I had no success. It would help a big deal to unify things for all platforms.
   >
   > BTW: PyNIO 1.4.0 finally supports the attributes attribute instead of __dict__ now. I will see 
   if I can clean that up in the code a bit this week. It should make things easier as well 
   while we are working on the minio replacement.

Could you send me the zipped package? I had a hard time downloading it a long time ago.

   >
   > Did you have a look at my CommonUtils package? (http://repositories.icg.kfa-juelich.de/hg/CommonUtils/) 
   I think it is helpful to bundle CF-related stuff somewhere. Probably we should try to adapt this 
   for use with minio as well (and ultimately maybe completely switch to that).

I haven't yet, I'll check it out.

Kari

On 9/15/2010 2:55 AM, Michael Decker wrote:

   >> Example: A Field has Title, Abstract and Keywords.
   >> Title matches with long_name, fine
   >> the global attribute "comment" is moved to DescribeCoverage "Abstract",
   >> but nothing in variables would match Field Abstract.
   > I'm not sure who came up with the comment->Abstract mapping. It might have been me. It seems to be the 
   closest guess one can make from CF, but of course a comment is not an Abstract. In some occasions it works 
   out pretty well and in others it makes pretty strange things showing up as a supposed "Abstract". However, 
   if we only pull our metadata from CF, we can't really do better I think.

I'm pretty sure I did it in the original code. I'm now working on putting the HTAP stuff into our catalog and browser; let's talk about this later.

Kari

WCS service got better in handling of CF attributes [Hoijarvi Decker]

For Michael, there's my new patch online.

Short description about the improved CF metadata propagation: If you look at

http://128.252.202.19:8080/BlueSky?service=wcs&version=1.1.2&request=DescribeCoverage&identifiers=CMAQ_DISP_D

fields PM25 and O3 have standard_name attribute, which is now used in WCS catalog:

http://webapps.datafed.net/catalog.aspx?table=WCS_fields&originator=bluesky

The CF related metadata is captured in our schemas: http://datafed.net/xs/CF_1_4_Metadata.xsd

It declares three different metadata extensions: at the Coverage, Field and Axis levels. This now actually validates against the official WCS schemas.

It took me a while to go through the CF 1.4 document and capture all the things I can meaningfully capture from it. There are still some attributes that don't really match well.

Example: a Field has Title, Abstract and Keywords. Title matches long_name fine; the global attribute "comment" is moved to the DescribeCoverage "Abstract", but nothing in the variables would match the Field Abstract.

We probably should make a wiki page documenting this mapping, and decide on some conventions of our own extending CF-1.4.

At the same time, it's time to clean up some internal names in the python dictionaries. We probably should stick 100% to CF names in the dictionaries, and translate them into WCS names at Capabilities and DescribeCoverage generation time.

Kari
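As an aside (not part of the original email), a minimal sketch of the name-translation idea described above. Only the long_name/Title and comment/Abstract pairs come from this thread; the rest of the code is an assumption, not the actual server dictionaries:

    # Illustrative only: keep CF attribute names internally and rename them to
    # WCS element names when Capabilities/DescribeCoverage XML is generated.
    CF_TO_WCS = {
        "long_name": "Title",   # discussed above: long_name maps to the WCS Title
        "comment": "Abstract",  # the imperfect comment -> Abstract mapping
    }

    def to_wcs_names(cf_attrs):
        """Rename CF keys to their WCS counterparts, leaving everything else as-is."""
        return {CF_TO_WCS.get(key, key): value for key, value in cf_attrs.items()}

    print(to_wcs_names({"long_name": "O3", "comment": "monthly mean", "units": "mole mole-1"}))
    # {'Title': 'O3', 'Abstract': 'monthly mean', 'units': 'mole mole-1'}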

Extended Metadata for WCS DescribeCoverage [Hoijarvi Husar Decker]

This CF standard took longer than I expected; I have to finish it tomorrow.

Here's the schema:

http://datafed.net/xs/CF_1_4_Metadata.xsd

The essential part is toward the end. It defines three different allowed metadata additions: one for the coverage level, one for the field level and one for the axis/dimension level. All the relevant attributes are picked from http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.4/cf-conventions.html#attribute-appendix and assigned to the correct section.


   <xs:element name="CoverageMetadata"  substitutionGroup="ows:AbstractMetaData" type="CoverageMetadataType" />
   <xs:complexType name="CoverageMetadataType">
       <xs:all>
           <xs:element name="Conventions" type="xs:string" minOccurs="0" />
           <xs:element name="calendar" type="CalendarType" minOccurs="0" />
           <xs:element name="history" type="xs:string" minOccurs="0" />
           <xs:element name="institution" type="xs:string" minOccurs="0" />
           <xs:element name="references" type="xs:string" minOccurs="0" />
           <xs:element name="source" type="xs:string" minOccurs="0" />
           <xs:element name="user_defined" type="UserMetadataType" minOccurs="0" />
       </xs:all>
   </xs:complexType>
   <xs:element name="FieldMetadata"  substitutionGroup="ows:AbstractMetaData" type="FieldMetadataType" />
   <xs:complexType name="FieldMetadataType">
       <xs:all>
           <xs:element name="ancillary_variables" type="xs:normalizedString" minOccurs="0" />
           <xs:element name="cell_measures" type="xs:string" minOccurs="0" />
           <xs:element name="cell_methods" type="xs:string" minOccurs="0" />
           <xs:element name="coordinates" type="xs:string" minOccurs="0" />
           <xs:element name="flag" type="FlagType" minOccurs="0" />
           <xs:element name="keywords" type="KeywordsType" minOccurs="0" />
           <xs:element name="add_offset" type="xs:double" minOccurs="0" />
           <xs:element name="scale_factor" type="xs:double" minOccurs="0" />
           <xs:element name="standard_name" type="xs:normalizedString" minOccurs="0" />
           <xs:element name="user_defined" type="UserMetadataType" minOccurs="0" />
           <xs:element name="valid" type="ValidType" minOccurs="0" />
       </xs:all>
   </xs:complexType>
   <xs:element name="AxisMetadata"  substitutionGroup="ows:AbstractMetaData" type="AxisMetadataType" />
   <xs:complexType name="AxisMetadataType">
       <xs:all>
           <xs:element name="axis" type="xs:string" minOccurs="0" />
           <xs:element name="flag" type="FlagType" minOccurs="0" />
           <xs:element name="formula_terms" type="xs:string" minOccurs="0" />
           <xs:element name="positive" type="PositiveType" minOccurs="0" />
           <xs:element name="standard_name" type="xs:normalizedString" minOccurs="0" />
           <xs:element name="user_defined" type="UserMetadataType" minOccurs="0" />
           <xs:element name="valid" type="ValidType" minOccurs="0" />
       </xs:all>
   </xs:complexType>
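As an illustration of how the schema above could be fed, here is a small sketch (assumed code, not the server implementation) that picks from a variable's CF attributes only the simple ones FieldMetadataType declares:

    # Hedged sketch: select the CF variable attributes that map onto the
    # FieldMetadata element defined above. The list mirrors the <xs:all>
    # children of FieldMetadataType; flag, keywords, valid and user_defined
    # are compound elements and are left out of this simple example.
    FIELD_METADATA_ATTRS = [
        "ancillary_variables", "cell_measures", "cell_methods", "coordinates",
        "add_offset", "scale_factor", "standard_name",
    ]

    def field_metadata(var_attrs):
        """Return only the attributes that belong in FieldMetadata."""
        return {name: var_attrs[name] for name in FIELD_METADATA_ATTRS if name in var_attrs}

    # Attributes of the vmr_o3 variable shown later on this page
    print(field_metadata({
        "standard_name": "mole_fraction_of_ozone_in_air",
        "long_name": "O3",          # goes to the WCS field Title, not FieldMetadata
        "cell_methods": "time: mean",
    }))
    # {'cell_methods': 'time: mean', 'standard_name': 'mole_fraction_of_ozone_in_air'}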

WCS Server Code Progress [Husar Vik Eckhardt Schultz Decker Hoijarvi]

Hello Aasmund and Paul,

I apologize for not reporting to you sooner on the WCS server software project. During the summer, Kari cleaned up and improved the structure of the WCS station-point server code. The saga of incorporating suitable metadata into the service still continues. I have also begun the description of the WCS server as part of the HTAP Data Network Pilot project, in collaboration with Martin Schultz's group in Juelich. In fact, the WCS server software was co-developed (and code-shared) by Kari Hoijarvi at CAPITA and Michael Decker at Juelich, with Martin and me directly involved in the design.

Anyway, the situation is that the WCS station-point server code is now accessible through SourceForge and the initial description is on the GEO Air Quality Community of Practice workspace, links below. While the code is based on the grid data server, it has substantial modifications for the WCS delivery of point monitoring data.

WCS Server Software

WCS server software for grid data

WCS server software for station-point data

I hope that our goal of making the WCS server 'portable' holds water. According to Kari, following the installation of the code, you will only need to fill out the configuration files and voilà! :). I am sure, however, that glitches will appear, and that with your applications and experience the server can be made more suitable and reusable for HTAP or any other of your projects. In any case, applying the server code to the EMEP data in your SQL server should be doable with modest effort.

Kari indicates that he will be delighted to respond to any questions you might have and to work with you. Also, I will be calling you soon ...

Best,

Rudi

-- 
Rudolf B. Husar, Professor and Director
Center for Air Pollution Impact and Trend Analysis (CAPITA),
Washington University,
1 Brookings Drive, Box 1124
St. Louis, MO 63130
+1 314 935 6054


WCS WFS Mix [Hoijarvi Decker Husar]

I just took a look at your changes. So far I don't understand everything you are trying to do there, but having WFS-related code in wcs_std seems a little strange. What exactly is it doing?

Anyway, a short test shows that there seem to be no immediate problems, so I put the new version online.

Michael


   On 09/10/10 23:38, Kari Hoijarvi wrote:
   >
   > Please pull my metadata in describecoverage changes.
   >
   > You need to run owsadmin wcs_prepare afterwards.
   >
   > Kari


   -- 
   Michael Decker
   Forschungszentrum Jülich
   ICG-2: Troposphäre
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de


   > Do you mean the line range 451 ... 459
   >
   >     if service == 'WFS':
   >         version = axisinfo['version']
   >         wfs_u = (
   >             "%s/%s?service=WFS&Version=%s&Request=GetFeature&typename=%s&filter=field:%s&outputFormat=text/csv"
   >             % (self.home, self.provider, version,
   >                urllib.quote_plus(coverage_name), urllib.quote_plus(field_name)))
   >         metadata = etree.SubElement(axisroot, self._sub_name('Metadata'))
   >         metadata.attrib[owsutil.xlink_name('type')] = 'simple'
   >         metadata.attrib[owsutil.xlink_name('href')] = wfs_u

Yes, that part. I don't really understand what you use it for. Are you somehow mixing WCS and WFS output?

   > Did you check out our coverage browser?
   > http://webapps.datafed.net/catalog.aspx?table=WCS_coverages&model=mozartgfdl-v2&experiment=sr3ea%2csr3eu&originator=htap

Yes, I saw it. The data sources are manually added, I guess? You could take out either HTAP or HTAP_monthly, as they contain exactly the same thing. I renamed it to HTAP_monthly later but left a symlink, as we originally published the HTAP link.

   Michael
   -- 
   Michael Decker
   Forschungszentrum Jülich
   ICG-2: Troposphäre
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de


I added the DescribeCoverage WFS explanation to

http://wiki.esipfed.org/index.php/WCS_Access_to_netCDF_Files#Architecture_of_datafed_WCS_server

and in the next section I added how to filter by loc_code:

http://wiki.esipfed.org/index.php/WCS_Access_to_netCDF_Files#Processing_of_custom_SQL_database

Kari

Thanks Kari,

Using the WFS protocol to encode the 'location table' in Station-Point datasets is indeed a significant step toward standards-based access to AQ data.

Thanks

R


On Sat, Jul 24, 2010 at 12:37 AM, Kari Hoijarvi <hoijarvi@seas.wustl.edu> wrote:

     I put a preliminary WFS into the services. It only has GetFeature: this was basically a copy-and-rename operation from the WCS, just a direct query without any filtering. There are no GetCapabilities or DescribeFeatureType calls and no bbox filters, but it's a real WFS call that just returns the CSV-formatted location table.


   In CIRA/VIEWS DescribeCoverage:
   http://128.252.202.19:8080/CIRA?service=WCS&version=1.1.2&Request=DescribeCoverage&identifiers=VIEWS
   The URL to locations is now a WFS call:
   http://128.252.202.19:8080/CIRA?service=WFS&Version=1.0.0&Request=GetFeature&typename=VIEWS&outputFormat=text/csv
   I haven't finished the registration, so just two parameters are in. I'll write a reader that gets all of it from the SQL DB.
   You can browse it too:
   http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/CIRA&coverage=VIEWS
   so webapps calls the WCS on 128.252.202.19, which in turn calls Colorado. It's surprisingly responsive.
   I'm updating the documentation now.
   I'm having trouble with http://sourceforge.net/p/aq-ogc-services/home/ ; it does not allow me to add files. Hopefully that's just a glitch that will be fixed soon. The files are still available from http://datafed.net/ows/


   Kari



-- 
Rudolf B. Husar, Professor and Director
Center for Air Pollution Impact and Trend Analysis (CAPITA),
Washington University,
1 Brookings Drive, Box 1124
St. Louis, MO 63130
+1 314 935 6054
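For reference, a minimal client-side sketch (not part of the thread above) of the GetFeature call Kari quotes, reading the CSV location table with the Python standard library; the column names depend on the location table and are not assumed here:

    import csv
    import io
    import urllib.request  # the server code of that era was Python 2 / urllib2

    # Hedged sketch: fetch the VIEWS location table through the preliminary WFS
    # GetFeature call quoted above and parse the CSV response.
    URL = ("http://128.252.202.19:8080/CIRA?service=WFS&Version=1.0.0"
           "&Request=GetFeature&typename=VIEWS&outputFormat=text/csv")

    with urllib.request.urlopen(URL) as response:
        reader = csv.DictReader(io.TextIOWrapper(response, encoding="utf-8"))
        rows = list(reader)

    print(len(rows), "locations")
    print(rows[0] if rows else "empty location table")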

Re: Reworking CF-metadata handling in the WCS [Decker Hoijarvi]

A lot still has to happen regarding CF conformance in the server. I'm also planning to work on that within the next months. As a by-product of our CFchecker, I have also developed a sort of low-level Python library for CF-NetCDF-related queries and other small but useful tools. It might save you some work to look at it, and ultimately, of course, it would be great if we could get it to run without PyNIO, which it is based on right now. The sources are available from http://repositories.icg.kfa-juelich.de/hg/CommonUtils/ (I gladly take suggestions for a better name). There are several sub-modules. It could also be interesting to integrate your iso_time module into this context, as it is certainly useful for more applications than just the OWS server.


Which reminds me: we should try to find out how we can make minio compile on Linux. So far I have had no success and gave up again due to lack of time.

Michael

On 09/10/10 15:51, Kari Hoijarvi wrote:

   >
   > I've been working on copying CF-metadata from netcdf files into the WCS
   > DescribeCoverage document. This enables us to capture all the CF
   > information, and add non-CF information as user-defined metadata.
   >
   > Since the WCS 1.1 schemas have this extension point, I was able to define an
   > actual W3C schema, so we have real validation of this.
   >
   > Some of this stuff is quite important:
   >
   > - standard name
   > - Correct positive=up/down in Z dimension.
   > - ancillary_variables: this attribute tells what variables are in fact
   > just metadata for this variable.
   > http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.4/cf-conventions.html#ancillary-data
   >
   -- 
   Michael Decker
   Forschungszentrum Jülich
   ICG-2: Troposphäre
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de

Content Types and Facet values [Hoijarvi Decker Husar]

OK, I just changed the MACC_bnds provider's resolution to "Hour" and put all the Vertical entries to "Unknown" until we have an idea what to use there. Maybe something like "Layers" could describe it correctly?

Michael

   On 09/10/10 20:12, Kari Hoijarvi wrote:
   >
   > Vertical: Column is for satellite aerosol data, since it aggregates the
   > whole column. Surface is for surface observations.
   >
   > rhusar: What should 4D models use?
   >
   >
   > TimeRes:Hour is proper for 3-hour data.
   >
   >
   > Kari
   >
   >
   >
   > On 9/10/2010 9:42 AM, Michael Decker wrote:
   >> Hi Kari,
   >>
   >> I have added the keywords as requested. But I'm not sure if all make
   >> sense in our case. For example the HTAP model data comes from several
   >> groups. I'm also not really sure what you want to describe with the
   >> Vertical keyword.
   >> Another problem: We are serving some ECMWF data for the MACC project
   >> now (MACC_bnds provider) and this data is 3-hourly. How to tag that?
   >> (I used "TimeRes:3Hour" for now)
   >>
   >> I will review your changes for mime types and see what I can find out,
   >> but my focus is not entirely on the OWS server right now - it should
   >> be again soon, though.
   >>
   >> Michael
   >


Hi,

On 09/10/10 20:08, Kari Hoijarvi wrote:

   >
   > Your solution seems to work fine for me. I didn't realize that you can
   > call init many times.
   >
   > in Lib/SimpleHTTPServer.py:
   >
   >     extensions_map.update({
   >         '': 'application/octet-stream',  # Default
   >         '.py': 'text/plain',
   >         '.c': 'text/plain',
   >         '.h': 'text/plain',
   >     })
   >
   > So that's why .conf was application/octet-stream. We should define all of them
   > explicitly, to avoid mismatches once this is hosted via Apache or IIS, etc.
   >

Feel free to add all the types you want into the mime.types file. Right now I don't see any use for further types from our side.


Michael

Datafed Juelich Connection [Husar Hoijarvi]

Kari,

Today we should take a break from documentation. The next topic is the DataFed connection to the Juelich server. For the HTAP report we have to demonstrate:

-- uFIND as the catalog of HTAP datasets from Juelich and DataFed

-- DataFed access to all Juelich data (as I was checking, the HTAP datasets don't work...??)

Then we have to talk about how to tag the HTAP datasets.... Talk later.

R


Standard Names in HTAP [Husar Hoijarvi Decker Schultz]

Perfect:

   float vmr_o3(time, lev, lat, lon) ;
       vmr_o3:standard_name = "mole_fraction_of_ozone_in_air" ;
       vmr_o3:long_name = "O3" ;
       vmr_o3:units = "mole mole-1" ;
       vmr_o3:cell_methods = "time: mean" ;
   float vmr_co(time, lev, lat, lon) ;
       vmr_co:standard_name = "mole_fraction_of_carbon_monoxide_in_air" ;
       vmr_co:long_name = "CO" ;
       vmr_co:units = "mole mole-1" ;
       vmr_co:cell_methods = "time: mean" ;
   float vmr_no(time, lev, lat, lon) ;
       vmr_no:standard_name = "mole_fraction_of_nitrogen_monoxide_in_air" ;
       vmr_no:long_name = "NO" ;
       vmr_no:units = "mole mole-1" ;
       vmr_no:cell_methods = "time: mean" ;


The standard name is from CF. I need to update the Capabilities processor to add the metadata into the field.

Kari

wcs catalog and 3rd party stuff [Hoijarvi]

Popup links:

http://webapps.datafed.net/catalog.aspx?table=WCS_coverages

HTAP demo.

http://webapps.datafed.net/datafed.aspx?wcs=http://htap.icg.kfa-juelich.de:58080/HTAP&coverage=EMEP-rv26_SR1_metm_2001&param_abbr=temp&scale_min=220&scale_max=290

Even some other services kind of work, at least for map view.

http://localhost:1119/dvoy_services/datafed.aspx?wcs=http://acdisc.sci.gsfc.nasa.gov/daac-bin/wcsL3&coverage=OMTO3d:OMI%20Column%20Amount%20O3:ColumnAmountO3

A major problem is choosing good defaults for each coverage: scale min/max, time ranges that are not too big, etc.

Kari
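Purely as an illustration of the 'good defaults' point above (the dictionary contents and function are assumptions, not datafed code), the idea could look like this:

    from urllib.parse import urlencode

    # Hedged sketch: per-field display defaults and the browse URL they produce,
    # mirroring the scale_min/scale_max parameters in the HTAP demo link above.
    DISPLAY_DEFAULTS = {  # hypothetical defaults keyed by (coverage, field)
        ("EMEP-rv26_SR1_metm_2001", "temp"): {"scale_min": 220, "scale_max": 290},
    }

    def browse_url(coverage, field):
        params = {
            "wcs": "http://htap.icg.kfa-juelich.de:58080/HTAP",
            "coverage": coverage,
            "param_abbr": field,
        }
        params.update(DISPLAY_DEFAULTS.get((coverage, field), {}))
        return "http://webapps.datafed.net/datafed.aspx?" + urlencode(params)

    print(browse_url("EMEP-rv26_SR1_metm_2001", "temp"))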

netCDF-CF vs. WCS text [Husar Hoijarvi]

http://wiki.esipfed.org/index.php/CF_Coordinate_Conventions#The_nice_match_between_netCDF-CF_and_WCS_1.1

Keywords for HTAP [Hoijarvi Decker]

Hello,

we're putting together our display here:

http://webapps.datafed.net/catalog.aspx?table=WCS_services

http://webapps.datafed.net/catalog.aspx?table=WCS_coverages


To enable the faceted search, like here: http://webapps.datafed.net/catalog.aspx?table=WCS_coverages&distributor=datafed

we need some standard keywords: http://128.252.202.19:8080/static/NASA/wcs_capabilities.conf

So could you please add the following keywords to your wcs_capabilities.conf:

KEYWORDS:

Domain:Aerosol, Platform:Model, Instrument:Unknown, DataType:Grid, Distributor:Juelich, Originator:Juelich, TimeRes:Month, Vertical:Column, TopicCategory:climatologyMeteorologyAtmosphere

Your existing keywords can stay.

Another patch: I added the patch "some custom mime types in ows.py" to enable viewing .conf files as text/plain. On Unix, mimetypes seems to support configuration files, but on Windows I had to add them myself in the code. Could you check how to define mime types for things? For CSV files I want the standard text/csv instead of application/octet-stream.

Kari
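A small sketch of the idea (an assumed approach, not the exact ows.py patch): registering the extra types explicitly with Python's mimetypes module, so the result is the same on Unix and Windows:

    import mimetypes

    # Register the types explicitly instead of relying on the platform's mime
    # database, so .conf and .csv do not fall back to application/octet-stream.
    mimetypes.add_type("text/plain", ".conf")
    mimetypes.add_type("text/csv", ".csv")

    print(mimetypes.guess_type("wcs_capabilities.conf"))  # ('text/plain', None)
    print(mimetypes.guess_type("locations.csv"))          # ('text/csv', None)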



Catalog Progress [Hoijarvi Husar]

Catalogs are coming up.

Browse Services:

http://webapps.datafed.net/catalog.aspx?table=WCS_services&domain=aerosol


Browse Coverages

http://webapps.datafed.net/catalog.aspx?table=WCS_coverages&domain=aerosol


TODO:

- email to Decker to add keywords to his service

- other services metadata

- Button to pop up our browser from these catalogs

- Add service menu next to add layer in browser

More to come; let's talk.

Kari

Documentation Work [Hoijarvi Husar]

On 8/27/2010 11:14 AM, Rudolf Husar wrote:

> Kari, would you point me to the writing you've done on the WCS wiki yesterday and today. I can't find them.
>
> Thanks

They are revisions of the links from the PowerPoint:

http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration#Enter_Provider_Metadata

http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Cubes

http://wiki.esipfed.org/index.php/Creating_NetCDF_CF_Files

http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#Storing_Point_Data_in_a_Relational_Database

http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#Location_Table_Configuration

http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#Data_Table_Configuration


http://wiki.esipfed.org/index.php/WCS_Wrapper_Configuration_for_Point_Data#GetCoverage
