WCS Development Issues


Back to WCS Wrapper

Miscellaneous things that have come up in the design, development, and deployment of the DataFed WCS system, including email conversations and personal observations.

Using Datafed client on NILU test DB with slightly irregular time dimension [Hoijarvi]

http://webapps.datafed.net/datafed.aspx?page=test/NILU

The map view is an aggregate over 30 months, so it's kind of slow.

Since there is some problem with the time periodicity in EBAS, I made the map view aggregate over one month. This demonstrates seasonal browsing.

http://webapps.datafed.net/datafed.aspx?page=test/NILU_month

Running WCS on NILU EBAS schema using Sybase [Hoijarvi Eckhardt]

Hello,

It actually looks pretty good; I can browse it better than I expected on the first try. But please check your DB connection. Currently your server reports an error:

Layer: 0, Origin: 0 Unable to connect: Adaptive Server is unavailable or does not exist

I made a test page for tweaking and testing the settings:

http://webapps.datafed.net/datafed.aspx?page=test/NILU

As soon as you have the DB online, I'll improve it. Currently there is a time range aggregate on the map view so that you can see something.

Other than that, the thing looks good. Here are my first comments:

   On 10/27/2010 8:27 AM, Paul Eckhardt wrote:
   > Those tests revealed some more questions:
   >
   > 1) for one coverage you assume a constant measurement interval, not only
   > across all locations, but also over the whole time series.

This can be dealt with in many ways. In general, the data fields in a coverage should share dimensions. So if you have data that actually does share dimensions, put it in one coverage. If you have hourly data and aggregated daily averages, make two coverages. If most of the locations are the same, the data can be in one coverage. If the location tables are different, make separate coverages. That's the easy part.

It seems that EMEP data does not fit this simple form, so we have to think about how to configure the WCS.

   > This is generally not the case for emep data:
   > 1a) every station has their own measurement cycle (some changing filters
   > at midnight, others at 6am; this will again differ for the precipitation
   > samplers).
   No comment about the importance or consequences of this.
   > 1b) not all measurements have the same resolution (everything from
   > minute resolution to monthly averages is in principle possible)

This probably means separating the data by time resolution into separate coverages.
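
For illustration, a split along these lines might look like the following sketch. The dictionary layout is hypothetical, not the exact conf_dict structure in NILU_config.py:

   # Sketch: one coverage per time resolution, so each coverage keeps a
   # consistent time dimension. Names and structure are illustrative only.
   coverages = {
       "EBAS_hourly": {
           "time_resolution": "PT1H",
           "fields": ["SO4_hourly", "NO2_hourly"],
       },
       "EBAS_daily": {
           "time_resolution": "P1D",
           "fields": ["SO4_daily", "NO2_daily"],
       },
   }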

   > 1c) some stations change their schedule over time.
   > 1d) there is no guarantee that a timeseries is complete (i.e. no gaps)

This has not been a problem. For example, CIRA/VIEWS has data twice a week; still, we consider it a daily dataset.

http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/CIRA&coverage=VIEWS&field=SO4f&datetime=2007-02-08

In general, when the field is normalized into the triple (loc_code, datetime, value), missing data is just a missing row.
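
A minimal sketch of that normalized form, using sqlite like the mock EBAS.db (table, column, and station names are illustrative):

   # Sketch: the normalized (loc_code, datetime, value) form. A gap in the
   # time series is simply an absent row; no placeholder rows are needed.
   import sqlite3

   db = sqlite3.connect(":memory:")
   db.execute("""CREATE TABLE SO4f (
                     loc_code TEXT NOT NULL,
                     datetime TEXT NOT NULL,  -- ISO 8601
                     value    REAL NOT NULL,
                     PRIMARY KEY (loc_code, datetime))""")
   # Twice-a-week data in a "daily" dataset: insert only the days you have.
   db.execute("INSERT INTO SO4f VALUES ('BADL1', '2007-02-05', 1.23)")
   db.execute("INSERT INTO SO4f VALUES ('BADL1', '2007-02-08', 0.98)")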

   > 1e) time series are not necessarily unique (there might be multiple
   > timeseries for one parameter at one location)
   

Does this mean that there could be hourly and daily SO4? That's no problem; again, those two are separate coverages.

If the data is truly random, like 1-20 measurements at random times during the day, different at each location, then we cannot have periodic time dimensions. The time dimension must then be just a range, without periodicity or enumerated times.

I haven't tested such coverages; yours might be the first. There's no technical reason why it should be a huge problem.
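
To make the distinction concrete, here is a sketch of the three ways a time dimension can be described. The keys are illustrative, not the actual OWS configuration schema:

   # Sketch: three descriptions of a temporal domain. For truly random
   # sampling, only the plain range applies. Keys are illustrative.
   periodic = {"begin": "1972-01-01", "end": "2007-01-07",
               "resolution": "P1D"}              # regular interval
   enumerated = {"times": ["2006-11-13", "2006-11-16",
                           "2006-11-20"]}        # explicit list of times
   range_only = {"begin": "1972-01-01",
                 "end": "2007-01-07"}            # no periodicity at all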

   > To address 1b, I added a filter criterion to only use timeseries with a
   > resolution of 1 day.

Seems to work for this test!

   > For the rest, I'm a bit lost. Maybe we need to aggregate (homogenize the
   > intervals) for all measurements in an intermediate database, in order to
   > fit in the OGC model?
   > Do we need to aggregate a unique value for each station, parameter, and day?

The wonderful thing about SQL is that it gives you this option but does not enforce it. We compiled GSOD, the Global Summary Of the Day, and did a ton of on-the-fly calculations using SQL views, producing what we needed directly from the original data. Later, some views were materialized for performance reasons. If you prefer, go ahead and create tables for the WCS views and populate them from the data.
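
As a sketch of that progression, assuming a triple-form table like the one sketched earlier (all names are illustrative):

   # Sketch: start with an on-the-fly view; materialize it later only if
   # performance requires it. Table and view names are illustrative.
   import sqlite3

   db = sqlite3.connect("EBAS.db")
   db.execute("""CREATE VIEW WCS_SO4f_daily AS
                 SELECT loc_code, date(datetime) AS day,
                        AVG(value) AS value
                 FROM SO4f
                 GROUP BY loc_code, date(datetime)""")
   # Later, the same query can populate a real table for performance:
   db.execute("""CREATE TABLE WCS_SO4f_daily_mat AS
                 SELECT * FROM WCS_SO4f_daily""")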

   > 2) the parameter naming conventions are not easily convertible with the
   > emep naming.
   > As an example there is SO4f (which I interpreted as SO4 particles in the
   > pm2.5 fraction - is this correct?), but emep distinguishes between
   > sea-salt-corrected SO4 and total SO4. Same for SO4t (which I understand
   > is the whole aerosol fraction, without a size cutoff?).
   > Please see http://tarantula.nilu.no/projects/ccc/nasaames/complist.html
   > for a (hopefully up-to-date?) list of components/matrixes used in emep.

We are using CF standard names:

http://webapps.datafed.net/table.aspx?database=catalog&table=standard_names

compiled from http://cf-pcmdi.llnl.gov/documents/cf-standard-names/standard-name-table/15/cf-standard-name-table.xml

If you have two measurements of the same SO4 made with two different types of instruments and you want to publish both, go ahead and call the fields SO4_instrA and SO4_instrB, but choose the standard_name attribute from the CF table if possible.

I'm not an expert in CF naming, but sea-salt-corrected SO4 and total SO4 sound like two different CF standard names. In any case, your fields can be called sea_salt_corrected_SO4 and total_SO4; that's fine. They may have the same or different standard names; that's no problem.
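
For example, the field declarations might carry the CF attribute like this. The keys are a sketch, not the exact NILU_config.py layout, and the standard name should be checked against the CF table:

   # Sketch: two distinctly named fields that happen to share a CF
   # standard name. Verify the exact string against the CF standard
   # name table; everything here is illustrative.
   fields = {
       "sea_salt_corrected_SO4": {
           "units": "ug/m3",
           "standard_name": "mass_concentration_of_sulfate_dry_aerosol_in_air",
       },
       "total_SO4": {
           "units": "ug/m3",
           "standard_name": "mass_concentration_of_sulfate_dry_aerosol_in_air",
       },
   }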

   >
   > Minor issues:
   > If I use the datafed browser, I can see some hiccups. I'm not sure if
   > this is due to my wrong configuration of the ows software...
   >
   > 1) the map layer is empty. Are there no features for the european
   > region, or is there a problem with the geo-references from my side?
   I had left the North America map as the default view. I changed it to the world map. The browser should pick a proper map based on catalog defaults.
   > 2) the default time displayed in the browser is 2010-07-29T06:00:00,
   > which i can not relate to any response from the wcs server. The
   > describeCoverage response contains:
   > <TemporalDomain>
   >   <TimePeriod>
   >    <BeginPosition>1972-01-01T06:00:00Z</BeginPosition>
   >    <EndPosition>2007-01-07T06:00:00Z</EndPosition>
   >    <TimeResolution>P1D</TimeResolution>
   >   </TimePeriod>
   > </TemporalDomain>

There is a problem in the "make_time_dimension" function in NILU_config.py. I filled the test database with daily data.

The time_min must have the same resolution as the data. So if you have daily data, it must not have hours; if it's hourly, it cannot have minutes.

So change

   time_min = iso_time.parse(row[i_time_min]).dt
   time_max = iso_time.parse(row[i_time_max]).dt

to

   time_min = iso_time.parse(row[i_time_min]).dt.date()
   time_max = iso_time.parse(row[i_time_max]).dt.date()

and the periodic dimension should work. I have to add a check for this in the iso_time module.

The SQL query is then compiled with datetime='2010-07-29T00:00:00' and requires exact precision; one second off and it won't return anything. If the data is not on precise time intervals, you can set a time window for the map view in the browser, to get all the data from datetime plus or minus a short period.
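
A sketch of such a tolerant query (table and column names are illustrative):

   # Sketch: instead of exact equality on datetime, select a small window
   # around the requested time. ISO 8601 strings compare correctly as text.
   from datetime import datetime, timedelta

   def window_query(requested, minutes=30):
       t = datetime.strptime(requested, "%Y-%m-%dT%H:%M:%S")
       lo = (t - timedelta(minutes=minutes)).isoformat()
       hi = (t + timedelta(minutes=minutes)).isoformat()
       return ("SELECT loc_code, datetime, value FROM WCS_SO4f "
               "WHERE datetime BETWEEN ? AND ?", (lo, hi))

   # window_query("2010-07-29T00:00:00") matches anything within half an hour.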


   > 3) changing the time field in the browser updates the map, but somehow
   > the timeseries display is not changed (still showing 2010)
   > I can manually "zoom" out & in the time axes of the time series plot though.
   The timeseries display should update the cursor, the blue vertical line. It does not update the time zoom.
   > 4) once I get the timeseries displayed, the y-scale seems to be fixed to
   > [0.0, 1.0]. What might be the reason for this? The values reported by
   > the wcs are correct, and agree with what is plotted (at least the values
   > below 1.0)

You can set a different scale. For each field I have to set a default zoom, currently manually, but I'm planning to monitor the registered WCS services so that a script can generate a better default. Click the "Service Program" button to change the settings.

   >
   > However, here's the best display I could get:
   > 2010-10-27-141538_620x691_scrot.png
   > This is a display for 15th Jan 2006, 06:00. The cross at the location of
   > Birkenes, Norway and the corresponding timeseries are plotted correctly.

That's better than I expected. As soon as you have the DB online, I'll take another look.

   > Kind regards
   >
   > Paul
   >
   >

Two more things: make sure you don't have critical passwords in your Python code, since the sources are HTTP-accessible. Create a read-only account and, for extra safety, keep the connection module outside OWS/web/static.

Also, in http://knulp.nilu.no:8080/static/NILU/wcs_capabilities.conf, change "Distributor:DataFed" to "Distributor:NILU".

You should also add "IS NOT NULL" to the filters for the location and data views, and filter out data that uses -999 as a missing-value marker. I never use nulls or -999 except when I have to filter them out at the lowest level of data access.
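
A sketch of such a view-level filter (table and column names are illustrative):

   # Sketch: filter NULLs and -999 missing markers out at the lowest
   # level, inside the view that the WCS queries.
   import sqlite3

   db = sqlite3.connect("EBAS.db")
   db.execute("""CREATE VIEW WCS_SO4f AS
                 SELECT loc_code, datetime, value
                 FROM raw_SO4f
                 WHERE value IS NOT NULL
                   AND value <> -999""")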

Greetings, Kari

Existing HTAP Services [Hoijarvi Decker]

Look into the point provider to see how to set up a database for a point WCS.

The CIRA provider shows how to make a custom processor for a database where you cannot create views.

Kari

   On 10/27/2010 2:57 AM, Michael Decker wrote:
   > Hi,
   >
   > those should all be valid and unique providers. HTAP_monthly* should be fully CF-compliant;
   > in the others that might not be entirely so, but that might improve in the future.
   >
   > Another thing:
   > I guess we will also have a look at point station data soon. Do you have some documentation
   > about what you have done there so far or some helpful query links for me to look at? I did not
   > look at the source code for that at all so far but probably will within the next few weeks...
   >
   > Michael
   >
   > On 10/26/10 17:37, Kari Hoijarvi wrote:
   >> Hi,
   >>
   >> I have registered these services into our catalog:
   >>
   >> http://webapps.datafed.net/HTAP.uFIND
   >> http://webapps.datafed.net/catalog.aspx?table=WCS_services&distributor=juelich
   >>
   >>
   >> Are all these valid, or are some of them duplicates?
   >>
   >> HTAP_FC_pressure
   >> HTAP_monthly
   >> MACC_bnds
   >> HTAP_FC
   >> HTAP_FE
   >> HTAP_FE_hourly
   >> HTAP_monthly_pressure
   >> HTAP_FE_pressure
   >>
   >
   >

Getting Timeseries from WCS [Hoijarvi Falke]

Expand the bbox a little bit:

http://webapps.datafed.net/cov_73556.ogc?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.0.0&CRS=EPSG:4326&COVERAGE=DD&TIME=2001-01-01T00:00:00/2001-01-31T00:00:00&BBOX=-119.6109,36.2301,-117.6109,38.2301,0,0&WIDTH=1&HEIGHT=1&DEPTH=-1&FORMAT=CSV

There is a known problem with rounding; I should fix this in a coming version of OWS.
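
The workaround amounts to padding the degenerate point bbox, as in the corrected URL above. A purely illustrative helper:

   # Sketch: expand a point into a small bbox so the request survives
   # the rounding problem.
   def expand_bbox(lon, lat, delta=1.0):
       return "%s,%s,%s,%s,0,0" % (lon - delta, lat - delta,
                                   lon + delta, lat + delta)

   # expand_bbox(-118.6109, 37.2301)
   # -> '-119.6109,36.2301,-117.6109,38.2301,0,0'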

By the way, you can call the WCS 1.1 services via our 1.0 proxy, but don't send these links anywhere; the proxy is only set up so that our browser will work. The id 73556 is completely random and may change at any time. So use the real service instead:

http://ww10.geoenterpriselab.com/dd?service=WCS&version=1.1.2&Request=GetCoverage&identifier=dd&BoundingBox=-119.6109,36.2301,-117.6109,38.2301,urn:ogc:def:crs:OGC:2:84&TimeSequence=2001-01-01T00:00:00/2001-01-31T00:00:00&format=image/netcdf&store=false

store=false works fine with Firefox.

Kari

   On 10/26/2010 3:53 PM, Falke, Stefan R (IS) wrote:
   >
   > Kari,
   >
   >  
   >
   > I’m trying to get a GetCoverage example for a time series request
   > on the degree day dataset. I’ve created the following using the TimeSeries
   > WCS Query form in the DataFed Browser but it doesn’t return anything.
   > Do you see anything wrong with the query? I made the request for a single
   > lat-lon point, which creates an unusual bbox in the WCS request, but
   > I don’t know if that’s the issue:
   >
   >  
   >
   > http://webapps.datafed.net/cov_73556.ogc?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.0.0&CRS=EPSG:4326&COVERAGE=DD&TIME=2001-01-01T00:00:00/2001-01-31T00:00:00&BBOX=-118.6109,37.2301,-118.6109,37.2301,0,0&WIDTH=1&HEIGHT=1&DEPTH=-1&FORMAT=CSV
   >
   >  
   >
   > I generated this from this page:
   >
   > http://webapps.datafed.net/datafed.aspx?page=NGC/DegreeDays
   >
   >  
   >
   > Thanks,
   >
   > Stefan

GeoEnterpriseLab gets new service online [Hoijarvi Roberts Falke]

   You could create yourself an organization here:
   http://webapps.datafed.net/table.aspx?database=catalog&table=organizations&mode=edit&edit=7019&edit_mode=dup&message=Creating+new+from+existing+item.
   Choose a nice-looking abbreviation like GEOEL, enter the new information, and save. Then you can use that abbreviation as distributor and originator.
   Kari
   On 10/26/2010 12:55 PM, Roberts, Gregory (IS) wrote:
   > Kari,
   >  
   > I updated the wcs_capabilities.conf and also added index.html to each wcs directory.
   > One thing though, I had to remove the Distributor:NorthropGrumman and Originator:NorthropGrumman
   > from the keywords, since we are not allowed to show where the data is from. Let me know if these are correct.
   >  
   > They're there.
   >
   > http://webapps.datafed.net/catalog.aspx?table=WCS_services&sort=domain+asc
   >
   > Greg, could you please add the keywords to wcs_capabilities.conf so that the catalog search facility can be used.
   >
   >
   > Kari
   >


http://webapps.datafed.net/catalog.aspx?table=WCS_coverages&originator=GEL&pagesize=50

Looks good.

Kari

   On 10/26/2010 2:08 PM, Roberts, Gregory (IS) wrote:
   > I created GeoEnterprise Lab (GEL) and updated all the wcs_capabilities.conf with Distributor:GEL, Originator:GEL
   >  
   >

Unit tests to fight regression bugs

I pulled and reviewed your patches already.

Unit tests are great indeed, and I'm using them for my own code as well. The problem with the OWS unit tests is that many of them never worked for me because of platform differences, etc., so I started ignoring them. I agree that this is a very poor solution and we should try to make them work for everybody. I will look into them again.

Michael

   On 10/25/10 17:01, Kari Hoijarvi wrote:
   > Thanks.
   >
   > You should add a unit test for that. They pay off in the long run,
   > especially in cross-platform porting when I write minio.
   >
   >
   >
   > There are a few patches that you should pull, review, and publish:
   >
   > * fixed windows-only is_string call
   >
   > that fixes a windows-only wrong name bug
   >
   >
   > * use_nio flag by sys.platform
   >
   > A little more convenient way to set use_nio
   >
   >
   >
   > * added SupportedCRS and SupportedFormat to wcs capabilities
   >
   > This probably does nothing more than add 4 nodes to each
   > coveragedescription in the capabilities, but I put it there anyway.
   >
   >
   > Kari
   >
   >
   > On 10/25/2010 2:52 AM, Michael Decker wrote:
   >> It's fixed and published now. When I changed the code for X/Y
   >> filtering, I forgot the T filter...
   >>
   >> Michael
   >>
   >> On 10/21/10 22:56, Kari Hoijarvi wrote:
   >>> This used to work, it was in my unit tests.
   >>>
   >>> http://htap.icg.kfa-juelich.de:58080/HTAP_monthly?Service=WCS&Version=1.1.2&Request=GetCoverage&Identifier=GEMAQ-v1p0_SR5SA_tracerm_2001&Format=application/x-netcdf&Store=false&TimeSequence=2001-04-16&RangeSubset=ap&BoundingBox=-180,-90,360,90,urn:ogc:def:crs:OGC:2:84
   >>>
   >>>
   >>>
   >>> It reports: unhashable type: 'list'
   >>>
   >>> Is this a newly introduced regression bug?
   >>>
   >>> Kari
   >>>
   >>>
   >>
   >
   -- 
   Michael Decker
   Forschungszentrum Jülich
   Institut für Energie- und Klimaforschung - Troposphäre (IEK-8)
   Tel.: +49 2461 61-3867
   E-Mail: m.decker@fz-juelich.de
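
The forgotten T filter in the thread above is exactly the kind of regression a small unit test pins down. A sketch, where filter_rows() is a hypothetical stand-in for the real OWS filtering code:

   # Sketch of a regression test for time filtering. filter_rows() is a
   # hypothetical stand-in for the real OWS code path.
   import unittest

   def filter_rows(rows, t=None):
       # rows are (loc_code, datetime, value) triples
       return [r for r in rows if t is None or r[1] == t]

   class TestTimeFilter(unittest.TestCase):
       def test_t_filter_is_applied(self):
           rows = [("STN1", "2001-04-16", 1.0),
                   ("STN1", "2001-05-16", 2.0)]
           self.assertEqual(filter_rows(rows, t="2001-04-16"),
                            [("STN1", "2001-04-16", 1.0)])

   if __name__ == "__main__":
       unittest.main()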

NILU/EBAS Gets Going

Fantastic!

Best

R

   2010/10/21 Paul Eckhardt <Paul.Eckhardt@nilu.no>
   Hi Rudy,
   I actually started testing Kari's version yesterday. I got his prototype (using the sqlite database) running in a local virtual Linux machine so far. Next, I will look into connecting our Sybase database directly. I'm currently starting to get into it. Can I contact you next week? Then I'll know more about how far I could get on my own, or at least will be able to ask specific questions.
   Thanks for your support so far!
   Paul


   On 22.10.2010, at 01:05, Rudolf Husar <rhusar@me.wustl.edu> wrote:
   >     Hello Paul and Aasmund,
   >
   >     Hope life is treating you well... maybe way too busy but well.
   >
   >     Could we have a brief session on the WCS server implementation for
   >     EMEP data on EBAS? Kari is now quite familiar with your data structure
   >     and he would be happy to help connect the SQL server to the WCS server software.
   >
   >
   >     Best regards,
   >
   >     Rudy
   >
   >      
   >
   >     On Fri, Oct 8, 2010 at 3:04 PM, Kari Hoijarvi <hoijarvi@seas.wustl.edu> wrote:
   >
   >          Hello all,
   >
   >         I have configured our WCS server for the NILU database schema.
   >         I copied stations and test data from CIRA/VIEWS, filled up a sqlite database, and here it is online:
   >
   >         http://128.252.202.19:8080/NILU
   >
   >         You can also view it online via our browser:
   >         http://webapps.datafed.net/datafed.aspx?wcs=http://128.252.202.19:8080/NILU&coverage=EBAS&field=SO4f
   >
   >         The defaults are bad; click the next day in the time controller to go to 2006-11-13 and
   >         you'll see some data.
   >
   >         To install this, you need a Windows machine with a public IP address.
   >         The project is at http://sourceforge.net/p/aq-ogc-services/home/
   >         Installation instructions: http://wiki.esipfed.org/index.php/WCS_Wrapper_Installation_WindowsOP
   >         I made simplifications to the code; the netcdf libraries are now optional.
   >         Once you have the point demo provider working, you can download your own stuff:
   >         http://128.252.202.19:8080/static/NILU/NILU.zip
   >         Unzip this and move the NILU folder to C:\OWS\web\static\ just where the point folder is. You should now have the NILU mock server running.
   >         The files under web\static\NILU
   >         EBAS.db: This is the mock sqlite database with dummy test data.
   >         index.html: Your home page
   >         wcs_capabilities.conf: Keywords, Contact information etc.
   >         NILU_wcs.py: The WCS server component.
   >         NILU_wfs.py: The WFS server component for the location table.
   >         EBAS_data.py: The main script that created the EBAS.db
   >         EBAS_sql.py: The SQL commands for EBAS_data.py
   >         NILU_config.py: The python dictionary/script that contains the WCS configuration
   >         To configure this to use the real Sybase DBMS:
   >         Edit EBAS_sql.py and change the connect() method to return the real db connection.
   >         You can either use a direct Sybase module for Python or go via ODBC (see the sketch after this thread).
   >         The EBAS schema is properly normalized, which allows using SQL as a configuration tool,
   >         via SQL views. If you absolutely don't want to create views, then we need to configure this
   >         WCS by adding rather complicated joins and aliases. Views are a lot simpler, and leave room
   >         for later optimization by taking a snapshot of them. The script that creates the WCS_location
   >         view, and a view like [WCS_SO4f] for each parameter, is in EBAS_sql.py. The methods
   >         create_location_view and create_wcs_view do the trick. They were called when EBAS_data.py
   >         created the mock database.
   >         Then announce the fields in NILU_config.py. The dictionary conf_dict in the method compile_config is
   >         quite self-explanatory. It lists the parameter names, units, and keywords. The script then adds the
   >         location dimension axis description to each field. After getting a hand-edited configuration
   >         working, I'd recommend reading the field names from the database, so that the configuration is
   >         automatic. In ows-point-1.2.3.zip I demonstrate that technique in CIRA: I scan the VIEWS
   >         database and compile the fields myself.
   >         The decision about what is a coverage and what is a field is quite simple. A coverage is a collection
   >         of fields that share the time dimension and spatial boundaries. So if you have the same data collected
   >         at different intervals, create a separate coverage for each time interval.
   >         Currently the fields also share the location dimension. The standard does not require this; if some
   >         station does not collect a parameter, that field should not have that station in its location table.
   >         This is on the todo list. There is also a limitation that you can query only one field at a time
   >         from a point coverage. This will be addressed, to allow multiple fields like in NetCDF cube queries.
   >         Good luck with the installation. I will be in the office every weekday; feel free to email, skype or
   >         call 1-314-935-6099 if you have questions. I'll be happy to help. Please remember that this is
   >         work in progress. I have discussed with Michael Decker restructuring the configuration
   >         dictionary; currently it has grown organically and the names are inconsistent. We need to
   >         standardize on the CF-1.4 names everywhere. This means some renaming in the future, but probably
   >         no more.
   >         Once you have your server up, you can try to browse the data with http://webapps.datafed.net/datafed.aspx?wcs=http://your.domain.here:8080/NILU&coverage=EBAS Currently the browser registers it once and updates changes daily, so if you add new fields don't expect our browser to show them at once.
   >
   >         Good Luck, Kari
   >
   >         On 9/30/2010 10:13 AM, Paul Eckhardt wrote:
   >
   >             Hi Kari and Rudy,
   >
   >             thank you for taking the time on the phone and for offering to further
   >             assist us!
   >
   >             Attached you can find the complete DDL (ebas.sql) for creating an empty
   >             ebas database (this version is a bit outdated, but it should serve our
   >             purpose).
   >
   >             I created a very reduced version for testing that just contains what I
   >             think might be interesting for the WCS server. This can be found in
   >             ebas_reduced.sql
   >
   >
   >             Some basic information about the design:
   >
   >             DS_DATA_SET describes what we call a dataset in ebas: one parameter
   >             measured at a certain station (in the full version some more complex
   >             dependencies on instrument, method etc.).
   >             ER_REGIME_CODE is always 'IMG' for observations.
   >             EM_MATRIX_NAME defines which medium the parameter is measured in (e.g.
   >             precipitation, air, pm10, ...)
   >             EC_COMP_NAME is the name of the parameter (e.g. sulphate_total, ...)
   >             DS_STARTDATE and DS_ENDDATE provide the timestamp of the first and last
   >             measurement in the timeseries.
   >
   >             A1_TIME contains the measurements. Relates n:1 to DS_DATA_SETS,
   >             FK=DS_SET_KEY.
   >
   >             EB_STATION contains the location data. Relates 1:n to DS_DATA_SETS.
   >
   >
   >             In case you have questions or need more or different information, please
   >             email to me and Aasmund.
   >
   >             Cheers,
   >
   >             Paul
   >
   >
   >
   >
   >
   >
   >     -- 
   >     Rudolf B. Husar, Professor and Director
   >     Center for Air Pollution Impact and Trend Analysis (CAPITA),
   >     Washington University,
   >     1 Brookings Drive, Box 1124
   >     St. Louis, MO 63130
   >     +1 314 935 6054
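
Kari's note above says to change the connect() method in EBAS_sql.py to return the real Sybase connection, either through a direct Sybase module or via ODBC. A minimal sketch of the ODBC route; the DSN, account, and password file path are placeholders, and per the advice earlier on this page the password file should live outside OWS/web/static:

   # Sketch: replace the sqlite connection in EBAS_sql.py with a Sybase
   # connection via ODBC. DSN and account are placeholders; the password
   # is read from a file outside the http-accessible static tree.
   import pyodbc

   def connect():
       password = open(r"C:\OWS\secrets\ebas_readonly.pwd").read().strip()
       return pyodbc.connect("DSN=EBAS;UID=wcs_readonly;PWD=" + password)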
