DataFed Data Access Services

From Earth Science Information Partners (ESIP)

Data Access Protocols and Adapters

(For a broader description of the DataFed interoperability approach see Interoperability of Web Service-Based Data Access and Processing)

The rich structure and semantics of Earth Science data mean that any given dataset can be accessed through multiple protocols. In general, each client and each server supports only a subset of these protocols. Thus, loose coupling between data access and processing services involves choices and negotiations. The main topics of client-server negotiation are the selection of a shared data access protocol and the choice of a returned data format.
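The negotiation described above can be sketched as a simple preference match. This is a minimal illustration, not DataFed code: the function names and the capability lists below are assumptions made up for the example.

```python
from typing import Optional, Sequence, Tuple


def negotiate(client_prefs: Sequence[str], server_offers: Sequence[str]) -> Optional[str]:
    """Return the first client-preferred option that the server also offers."""
    offered = set(server_offers)
    for option in client_prefs:
        if option in offered:
            return option
    return None


def negotiate_access(
    client_protocols: Sequence[str],
    server_protocols: Sequence[str],
    client_formats: Sequence[str],
    server_formats: Sequence[str],
) -> Optional[Tuple[str, str]]:
    """Pick a shared protocol and a shared return format, or None if either is missing."""
    protocol = negotiate(client_protocols, server_protocols)
    data_format = negotiate(client_formats, server_formats)
    if protocol is None or data_format is None:
        return None
    return protocol, data_format


# Hypothetical capability lists, ordered by client preference.
client_protocols = ["SOAP", "WCS", "WMS"]
server_protocols = ["WCS", "WMS"]
client_formats = ["netCDF", "CSV"]
server_formats = ["GeoTIFF", "netCDF"]

choice = negotiate_access(client_protocols, server_protocols, client_formats, server_formats)
print(choice)  # ('WCS', 'netCDF')
```

The client lists its options in order of preference, so the first match is the one it would most like to use; a real negotiation would typically read the server's offerings from a capabilities document rather than a hard-coded list.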

An example of a flexible data access interface is shown in Fig.3. It represents the data access module in the federated data system, DataFed.

Fig.3a. The electric adapter is a good analogue of the DataFed software adapters.
Fig.3b. Data Access Protocols and Adapters

Individual, heterogeneous, distributed datasets are connected to the data system through wrappers, which homogenize the access. On the client side, the interface to the data system is through adapters that provide access through multiple protocols and data formats. The specific data access protocols offered through DataFed are shown in Table 1 for nine representative datasets.


Table 1. Data access protocols offered through DataFed for nine representative datasets.

| Catalog                      | Display/Discuss  | Data Description        | Spatial Data Access                        |
| Dataset       | Registration | Viewer | Discuss | Sensor Type | Data Type | Data Access | WCS | WFS | WMS | Url | SOAP |
| AIRNOW        | XML - Form   | View   | Wiki    | In Situ     | Point     | Protocols   | X   | X   | X   | X   | X    |
| SURF_MET      | XML - Form   | View   | Wiki    | In Situ     | Point     | Protocols   | X   | X   | X   | X   | X    |
| VIEWS_OL      | XML - Form   | View   | Wiki    | In Situ     | Point     | Protocols   | X   | X   | X   | X   | X    |
| THREDDS_CDM   | XML - Form   | View   | Wiki    | Model       | Grid      | Protocols   | X   |     | X   | X   | X    |
| THREDDS_GFS   | XML - Form   | View   | Wiki    | Model       | Grid      | Protocols   | X   |     | X   | X   | X    |
| NCDC_AVG_WIND | XML - Form   | View   | Wiki    | Model       | Grid      | Protocols   | X   |     | X   | X   | X    |
| CIESIN        | XML - Form   | View   | Wiki    | Model       | SeqImage  | Protocols   | X   |     | X   | X   | X    |
| OnEarth_JPL   | XML - Form   | View   | Wiki    | RemSens     | SeqImage  | Protocols   | X   |     | X   | X   | X    |
| SEAW_US       | XML - Form   | View   | Wiki    | RemSens     | SeqImage  | Protocols   | X   |     | X   | X   | X    |
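
For programmatic use, the protocol columns of Table 1 can be transcribed into a small lookup structure. The sketch below is only an illustration: the dictionary layout and the helper name are assumptions, and it reads the empty cell in the grid and image rows as the WFS column.

```python
# Protocol availability transcribed from Table 1.
# Assumption: the empty cell in the Grid and SeqImage rows is the WFS column.
ALL_PROTOCOLS = frozenset({"WCS", "WFS", "WMS", "Url", "SOAP"})
NO_WFS = ALL_PROTOCOLS - {"WFS"}

# dataset name -> (sensor type, data type, available protocols)
DATASETS = {
    "AIRNOW": ("In Situ", "Point", ALL_PROTOCOLS),
    "SURF_MET": ("In Situ", "Point", ALL_PROTOCOLS),
    "VIEWS_OL": ("In Situ", "Point", ALL_PROTOCOLS),
    "THREDDS_CDM": ("Model", "Grid", NO_WFS),
    "THREDDS_GFS": ("Model", "Grid", NO_WFS),
    "NCDC_AVG_WIND": ("Model", "Grid", NO_WFS),
    "CIESIN": ("Model", "SeqImage", NO_WFS),
    "OnEarth_JPL": ("RemSens", "SeqImage", NO_WFS),
    "SEAW_US": ("RemSens", "SeqImage", NO_WFS),
}


def datasets_supporting(protocol: str) -> list:
    """List the datasets (in table order) that offer the given protocol."""
    return [name for name, (_, _, protocols) in DATASETS.items() if protocol in protocols]


print(datasets_supporting("WFS"))  # ['AIRNOW', 'SURF_MET', 'VIEWS_OL']
print(len(datasets_supporting("WCS")))  # 9
```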

Dataset. This column shows the names of the datasets selected for this demonstration. Each dataset has a unique name that can be used to access it for browsing and other purposes. The full list of available data is in the DataFed catalog.

Liping: This dataset selection includes both observational (surface and satellite) and model data. This subset of the available data could be useful for the July Denver GSN demo at IGARSS 06. As you can see from the table, all these datasets are available through WCS and WMS as well as through their respective SOAP interfaces. I hope you can make use of these in the demo. We can demonstrate the various processing and overlay services through our DataFed client and server. Rhusar 01:05, 28 May 2006 (EDT)

Registration. A dataset registration (rendered as XML and as a form) contains all the information needed to find and to access (bind) a dataset for purposes of web service chaining within the DataFed system. However, this is not a formal protocol-based registration.

Liping: If you have a catalog service, we would be happy to register these services in the catalog for the demo. We could also consider using the Earth-Sun System Gateway the same way as we used it for the Beijing GSN Demo. Rhusar 01:05, 28 May 2006 (EDT)

Viewer. Each dataset in DataFed can be accessed through a single generic viewer, which allows browsing through the multi-dimensional dataset and editing the service flow for each data layer. The viewer also provides access to the settings of each web service through the service flow diagram.

Chainers: The viewer is a good way to see what the dataset looks like. It would be good if other chainers could show specific data access calls that produce some sort of output. Rhusar 01:06, 28 May 2006 (EDT)

Discuss. Each dataset has a wiki page that can be modified by any user. Initially, the page contains only a brief description of the dataset along with a link to the viewer. The "talk" pages (accessible through the discussion tab) are suitable for threaded discussion.

Chainers: Dataset-specific comments can be placed either on the main dataset description page or on the associated discussion page. Rhusar 01:05, 28 May 2006 (EDT)

Sensor Type. This field identifies the nature of the "sensor" that is used to produce the dataset. In these examples, In Situ refers to point monitoring of air quality through surface sensors, Model represents output from numerical simulation models, and RemSens is typically data from satellite sensors.

Data Types Used in DataFed. The output from sensors is structured into different data types. The three main data types in DataFed are point, grid, and seqimage, shown pictorially in Fig.4.

Fig. 4. Data Types used in DataFed

Point data arise from monitoring sites at fixed geographic points. Typically these data consist of time series of multiple parameters at each station. Grid data typically arise from model simulations that have regular spacing and use one of the standard coordinate systems (projections). Model grids are typically multi-dimensional, covering X, Y, Z, and T as well as parameter dimensions. SeqImage is a data type for time-sequenced georeferenced images, such as satellite and radar images, that are produced at fixed time intervals (hourly, daily). Each image covers a spatial domain, and the sequence captures the variation in time. Numerous other data types used in air quality are not shown here, including trajectories and multi-spectral satellite images.
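
One way to picture the three data types is as simple record structures. The sketch below is an illustration only: the class and field names are assumptions chosen for the example and do not reflect DataFed's internal data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class PointSeries:
    """Time series of several parameters at one fixed monitoring site."""
    station_id: str
    lon: float
    lat: float
    times: List[datetime]
    values: Dict[str, List[float]]  # parameter name -> one value per time step


@dataclass
class Grid:
    """Regularly spaced, multi-dimensional model output for one parameter."""
    parameter: str
    dims: Tuple[str, ...]   # for example ("T", "Z", "Y", "X")
    shape: Tuple[int, ...]  # number of cells along each dimension
    projection: str


@dataclass
class SeqImage:
    """Georeferenced images produced at fixed time intervals."""
    bbox: Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
    times: List[datetime]
    image_refs: List[str]  # one image reference per time step


# Hypothetical station with two hourly values of one parameter.
site = PointSeries(
    station_id="SITE_1",
    lon=-90.2,
    lat=38.6,
    times=[datetime(2006, 5, 28, 0), datetime(2006, 5, 28, 1)],
    values={"PM25": [12.0, 14.5]},
)
print(len(site.times), list(site.values))  # 2 ['PM25']
```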

Chainers: Note that each data type has a specific set of access protocols and returned data formats. We are pursuing netCDF-CF for grid data or netCDF-table for point data. Rhusar 01:05, 28 May 2006 (EDT)

Data Access. The data access protocols that are available for any given dataset are listed in a special form illustrated in Fig.5.

Fig. 5. Typical Form Facilitating Access to Multiple Data Access Services

Chainers: These forms for data access services are generated automatically by the adapter module (device driver), which can transform our internal data formats into any other format. We can talk about these at the next Chainers telecon. Rhusar 01:05, 28 May 2006 (EDT)

WCS, WFS and WMS are OGC protocols for Coverages, Features and Maps, respectively. Each OGC service has an associated GetCapabilities document, which lists its offerings. The data are also accessible through a DataFed-specific CGI interface using key-value pairs, similar to the W*S OGC REST (URL) interfaces. Finally, a SOAP interface is offered for accessing data through formal SOAP-based web services. The strong typing of this interface is assured by the WSDL for each service, which in turn is defined by a formal XML schema. The output formats for each data type and access protocol are listed in a separate table.
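
As an illustration of the key-value-pair style, the sketch below builds GetCapabilities and GetCoverage request URLs using the parameter names of the WCS 1.0.0 key-value encoding. The endpoint address, the use of a dataset name as the coverage identifier, the format name and the bounding box are all assumptions for the example; the actual DataFed URLs and offerings should be taken from each service's GetCapabilities document.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint used only for illustration; not a DataFed address.
BASE_URL = "http://example.org/wcs"


def get_capabilities_url(base_url: str, service: str) -> str:
    """Build a key-value-pair GetCapabilities request for an OGC service."""
    return base_url + "?" + urlencode({"SERVICE": service, "REQUEST": "GetCapabilities"})


def get_coverage_url(base_url, coverage, bbox, time, fmt, width, height) -> str:
    """Build a WCS 1.0.0 style GetCoverage request from key-value pairs."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # min_lon,min_lat,max_lon,max_lat
        "TIME": time,
        "FORMAT": fmt,
        "WIDTH": width,
        "HEIGHT": height,
    }
    return base_url + "?" + urlencode(params)


caps_url = get_capabilities_url(BASE_URL, "WCS")
cov_url = get_coverage_url(
    BASE_URL,
    coverage="THREDDS_GFS",      # assumed coverage identifier
    bbox=(-125, 25, -65, 50),    # assumed region of interest
    time="2006-05-28T00:00:00Z",
    fmt="NetCDF",                # assumed format name
    width=600,
    height=250,
)
print(caps_url)
print(cov_url)

# Decoding the query string recovers the original key-value pairs.
decoded = parse_qs(urlsplit(cov_url).query)
print(decoded["REQUEST"], decoded["BBOX"])
```

Because `urlencode` percent-encodes reserved characters such as commas and colons, the printed URL looks different from a hand-written one, but a server decoding the query string sees the same key-value pairs.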