Difference between revisions of "WCS Wrapper Configuration"

From Earth Science Information Partners (ESIP)
Line 171: Line 171:
 
== Configuring the WCS using SQL Views ==
 
== Configuring the WCS using SQL Views ==
  
This is demostrated in the test provider '''point'''
+
This is demonstrated in the test provider '''point'''.
 +
 
 +
The demonstration is using sqlite, which is distributed with python by default. The project has following files:
 +
 
 +
* '''pntdata.py''': This script creates the test database and fills it with dummy data.
 +
* '''pntdata.db''': The sqlite database file created by pntdata.py
 +
* '''point_config.py''': The requiored view names

Revision as of 15:39, July 22, 2010

Back to WCS Wrapper

Project on SourceForge

Questions and comments should go to sourceforge discussions, bug reports to sourceforge tickets. Urgent issues can be asked from Kari Hoijarvi 314-935-6099(w) or 314-843-6436(h)

Last updated 2010-07-21

Structure of OWS/web

OWS/web is for system developers only.

OWS/web/static contains static web content. You can put any documentation here and it will be served as a web page or download. The home page index.html is pretty much mandatory, and you shoud change favicon.ico to reflect your organization. We highly recommend, that you customize these to document your WCS service.

OWS/web/static/cache is a folder for temporary files. The service uses it for output files. Anything you put there will be deleted when space is needed.

The installation contains an example datasets OWS/web/static/testprovider and OWS/web/static/point. The testprovider is a demo NetCDF dataset, point is an example how to server point data from a SQL database. Every service will have a folder with the same name here.

You may now check the provider page http://localhost:8080/testprovider which is served as a static file. Any file under static becomes accessible

The Human Interface: Create the index.html Front Pages for Visitors.

If no query is present, the server gives a default page index.html. You should provide pages for your server and for all the providers.

The server index.html is at OWS/web/static/index.html, which will be displayed from url http://localhost:8080/, Index of an external server index.html is here.

Every provider folder should also have an index.html like OWS/web/static/testprovider\index.html which will be displayed from http://localhost:8080/testprovider, index of an external provider front page is here

Every provider should have wcs_capabilities.conf that lists keywords and contact information. The format is simple, copy one from the testprovider and edit it.

Windows Implementation Bug

Important There is a bug deep in python core libraries that make serving text files tricky. The files need to be encoded with unix style line ending convention '\n', instead of windows style '\r\n'.

To fix this, issue command:

   python /OWS/web/owsadmin.py unix_nl filename.html

for every html file you serve.

Serving data from periodic collection of NetCDF files

Sometimes you have accumulated a huge number of small NetCDF files, like daily slices from a model output. You could combine those into one big cube, but you for a terabyte of files, that may not be an option.

Download our HTAP test package custom-netcdf-1.2.0.zip. It only has two days of data to make download small. Then read the custom provider page

Storing Point Data in a Relational Database

Provider point is an example how to configure this service to use SQL database to serve point data.

Point data is often stored in SQL databases. There's no standard schema like CF-1.0 convention for NetCDF files, so it is not possible to just connect and start serving. You have to create the configuration description.

Therefore, the WCS query processor needs to know what to select and join. This information must be edited into the configuration script.

Notes on SQL

One of the most powerful ideas in relational database design is the concept of a view. You don't need to change the existing data tables, creating a view that makes your DB to look like the one needed is usually enough. This is by far the easiest way to configure your WCS.

It is better to design a normalized schema and only optimize with benchmarks available. Especially filtering small lat/lon ranges is much more efficient to do on a normalized location table rather than denormalized data table.

Location Table

The common thing between different databases is, that they need to have a location table.

   table location
   +----------+-------------------------+
   | loc_code | lat   | lon     | elev  |
   +----------+-------------------------+
   | KMOD     | 37.63 | -120.95 |  30.0 |
   | KSTL     | 38.75 |  -90.37 | 172.0 |
   | KUGN     | 42.42 |  -87.87 | 222.0 |
   |...       |       |         |       |
   +----------+-------------------------+

Here loc_code is the primary key and lat,lon is the location. Optional fields can be added. CIRA VIEWS database has a location table, but it's called Site and it spells full longitude. The datafed browser uses standard names loc_code, loc_name, lat and lon for browsing. For plug-and-play compatibility we recommend using these names. In the CIRA VIEWS database, the view creation would be:

   create view location as 
   select 
       SiteCode as loc_code, 
       Latitude as lat, 
       Longitude as lon 
   from Site

The primary key is loc_code, being unique for all the locations.

If the fields have different names they can be aliased in the configuration.

Some Different DB Schema types

In this documentation three different schemas are presented. Each of them have good and bad points.

One Big Data Table

In this case, all the data is in the same table:

   +----------+------------+------+------+------+
   | loc_code | datetime   | TEMP | DEWP | VIS  |
   +----------+------------+------+------+------+
   | KMOD     | 2009-06-01 | 87.8 | 51.4 | 10   |
   | KMOD     | 2009-06-02 | 82.3 | 51.4 | NULL |
   | KSTL     | 2009-06-01 | 78.6 | 34.9 | 18   |
   | ...      |            |      |      |      |
   +----------+------------+------+------+------+

The foreign key to location table is loc_code. The primary key is (loc_code, datetime)

Strengths: Simple, No joining when querying all the fields.

Downsides: Needs nulls for missing data, querying just one field is inefficient.

Long And Skinny Table

In this case, all the data is in the same table:

   +----------+------------+------+-------+
   | loc_code | datetime   | data | param |
   +----------+------------+------+-------+
   | KMOD     | 2009-06-01 | 87.8 | TEMP  |
   | KMOD     | 2009-06-02 | 82.3 | TEMP  |
   | KSTL     | 2009-06-01 | 78.6 | TEMP  |
   | KMOD     | 2009-06-01 | 51.4 | DEWP  |
   | KMOD     | 2009-06-02 | 51.4 | DEWP  |
   | KSTL     | 2009-06-01 | 34.9 | DEWP  |
   | KMOD     | 2009-06-01 | 10   | VIS   |
   | KMOD     | 2009-06-02 | 10   | VIS   |
   | KSTL     | 2009-06-01 | 18   | VIS   |
   | ...      |            |      |       |
   +----------+------------+------+------+

Strengths: No nulls, Easy to add fields.

Downsides: Querying requires extra filtering with parameter index, slower than others.

One Data Table For Each Param

Each parameter has its own data table. In this case there's no need for nulls, and is the fastest for one parameter query.

   +----------+------------+------+
   | loc_code | datetime   | TEMP |
   +----------+------------+------+
   | KMOD     | 2009-06-01 | 87.8 |
   | KMOD     | 2009-06-02 | 82.3 |
   | KSTL     | 2009-06-01 | 78.6 |
   | ...      |            |      |
   +----------+------------+------+


   +----------+------------+------+
   | loc_code | datetime   | DEWP |
   +----------+------------+------+
   | KMOD     | 2009-06-01 | 51.4 |
   | KMOD     | 2009-06-02 | 51.4 |
   | KSTL     | 2009-06-01 | 34.9 |
   | ...      |            |      |
   +----------+------------+------+


   +----------+------------+-----+
   | loc_code | datetime   | VIS |
   +----------+------------+-----+
   | KMOD     | 2009-06-01 | 10  |
   | KMOD     | 2009-06-02 | 10  |
   | KSTL     | 2009-06-01 | 18  |
   | ...      |            |     |
   +----------+------------+-----+


Strengths: No nulls, Easy to add tables, easy to add heterogenous flag fields, fastest queries for single parameter.

Downsides: More tables, querying all the parameters at once requires a massive join.

Configuring the WCS using SQL Views

This is demonstrated in the test provider point.

The demonstration is using sqlite, which is distributed with python by default. The project has following files:

  • pntdata.py: This script creates the test database and fills it with dummy data.
  • pntdata.db: The sqlite database file created by pntdata.py
  • point_config.py: The requiored view names