DataFed

From Earth Science Information Partners (ESIP)
Revision as of 00:26, January 23, 2008 by Rhusar (talk | contribs)

General

Contact

Data System Name: DataFed
Data System URL: http://datafedwiki.wustl.edu
Contact Person: Rudy Husar
Contact e-mail: rhusar@me.wustl.edu

Background

About the Data System (Purposes, Audience)

DataFed is Web services-based software that non-intrusively mediates between autonomous, distributed data providers and users. DataFed is designed in accordance with the GEOSS architecture: it provides standard interfaces to heterogeneous, distributed data; fosters data integration and use through processing Web services and tools; and collects metadata and user feedback on datasets. DataFed also provides standards-based data feeds to the NASA Giovanni System.

Presentation

Not Given

History

The federated data system, DataFed, has been in development since 2001 at CAPITA, Washington University, with grants from NSF, NASA, EPA and RPOs. Since 2004, DataFed has provided both regulatory and policy support to EPA. Within CAPITA, DataFed has become a scientific data analysis tool.

Agencies

Washington University

List of Publications, Papers, Presentations

Data System Scope

Data Content

Datasets Served

Not Given

Parameters

DataFed provides access to over 50 distributed datasets relevant to air quality (surface, satellite, and model), which can be explored and analyzed with tools for processing and visualization.

Spatial - Temporal Coverage

About half of the datasets are global in scale and a third are US-scale, while the remaining datasets cover other regions. Most datasets are multi-year in extent. About ten datasets are near-real-time.

Applications/Potential


Health

DataFed has not yet been applied to health studies. However, the datasets mediated through DataFed are suitable for health studies, particularly in conjunction with the 1km-resolution US population data.

Forecasting and Reanalysis

DataFed has no past applications to forecasting and reanalysis. A current NASA project with BAMS uses DataFed to assimilate surface observations.

Model/Emissions Evaluation

The EPA NEISGEI Project uses DataFed to integrate and to evaluate multiple emission databases. DataFed was used to prepare an evaluation of the CMAQ Aerosol Model. A current NASA project with BAMS uses DataFed to compare model simulations and emissions.

Characterization, Trends, Accountability

DataFed was used to perform aerosol characterization for the RPO project, FASTNET. DataFed is the main data source supporting the development of EPA's Exceptional Event Rule. It is now used in the implementation of the EE Rule.

Other

Since 2004, a major role of DataFed has been to participate in interoperability experiments for GEOSS.

Data System IT

Primary/Official Store for Some data

Not Given

Data Consolidation/integration

Not Given

Providing Data Access to users/externals

DataFed homogenizes distributed, heterogeneous datasets through data 'wrappers'. As a result, all data mediated by DataFed are accessible through the international standard OGC data access services, WCS and WMS. At this time, all data access services are free and offered through an open interface.
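As an illustration of what access through a standard OGC interface looks like, the following minimal sketch builds a WMS 1.1.1 GetMap request URL. The query parameter names come from the WMS standard; the endpoint URL and the layer name are hypothetical placeholders and are not actual DataFed addresses or dataset identifiers.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_wms_getmap_url(endpoint, layer, bbox, width=512, height=256, time=None):
    """Build a WMS 1.1.1 GetMap URL for one layer in geographic coordinates."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    if time is not None:
        params["TIME"] = time  # optional time dimension defined by WMS
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, used only for illustration.
url = build_wms_getmap_url(
    "http://example.org/wms",
    "hypothetical_pm25_layer",
    bbox=(-125, 24, -66, 50),
    time="2007-08-15",
)
query = parse_qs(urlparse(url).query, keep_blank_values=True)
print(url)
```

Because the request is an ordinary URL, the same call can be issued from a browser, a script, or a GIS client without DataFed-specific software.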

Data Processing

The processing of raw data is performed by reusable web-service components, which include filtering, aggregation, and data fusion services. Data processing applications are created by chaining services using workflow software.
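The chaining idea can be sketched in a few lines. In this minimal example, plain Python functions stand in for the web-service components, and a small `chain` helper stands in for the workflow software. The function names, the record layout, and the rule that negative values are invalid are assumptions made for illustration; they do not describe DataFed's actual services.

```python
from functools import reduce
from statistics import mean


def filter_valid(records):
    """Filtering step: drop records with missing or negative values."""
    return [r for r in records if r["value"] is not None and r["value"] >= 0]


def aggregate_by_station(records):
    """Aggregation step: average the values reported by each station."""
    grouped = {}
    for r in records:
        grouped.setdefault(r["station"], []).append(r["value"])
    return {station: mean(values) for station, values in grouped.items()}


def chain(*services):
    """Compose services so the output of each feeds the next one."""
    return lambda data: reduce(lambda acc, service: service(acc), services, data)


# Made-up observations, used only to exercise the chain.
observations = [
    {"station": "A", "value": 10.0},
    {"station": "A", "value": 14.0},
    {"station": "B", "value": None},
    {"station": "B", "value": 8.0},
    {"station": "B", "value": -999.0},
]

workflow = chain(filter_valid, aggregate_by_station)
result = workflow(observations)
print(result)
```

Each step only depends on the shape of its input, so a step can be reordered, replaced, or reused in another workflow without changing the others.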

Visualization/Analysis

The visualization tools for parameter-spatial-temporal browsing are applicable for each dataset in the federated data system. The output data from the processing services are also available for mashups with other popular tools such as Google Earth and GIS software.
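One common way to bring a rendered map image into Google Earth is a KML GroundOverlay that points at the image URL. The sketch below generates such a file with the Python standard library. It is an assumption about how a mashup could be assembled, not a description of DataFed's own output format, and the image URL is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_ground_overlay(name, image_url, north, south, east, west):
    """Return a KML document that drapes an image over a lat/lon box."""
    kml = ET.Element("kml", xmlns=KML_NS)
    overlay = ET.SubElement(kml, "GroundOverlay")
    ET.SubElement(overlay, "name").text = name
    icon = ET.SubElement(overlay, "Icon")
    ET.SubElement(icon, "href").text = image_url
    box = ET.SubElement(overlay, "LatLonBox")
    for tag, value in (("north", north), ("south", south),
                       ("east", east), ("west", west)):
        ET.SubElement(box, tag).text = str(value)
    return ET.tostring(kml, encoding="unicode")


# Hypothetical image URL, used only for illustration.
kml_text = build_ground_overlay(
    "Example air quality layer",
    "http://example.org/wms?REQUEST=GetMap&LAYERS=hypothetical_layer",
    north=50, south=24, east=-66, west=-125,
)
print(kml_text)
```

Saving the returned string to a `.kml` file and opening it in Google Earth would display the referenced image over the given bounding box, provided the image URL is reachable.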

Decision Support (e.g. some integration into user business process)

Not Given

End-to-End Integration

Not Given

Other DS Values

Not Given

Data Access and/or Output Interoperability

Both the raw input data and the processed outputs are accessible through international standard interfaces. This allows the creation of loosely coupled network applications.

Reusable Tools and Methods

The data access, processing and visualization services in DataFed are all composed of reusable Web services, which are exposed through both SOAP and REST protocols.

Security Barriers and Solutions

The data access and processing services are accessible through SOAP interfaces described in WSDL. Because SOAP messages are typically carried over HTTP, they generally pass through firewalls. At this time there are no access restrictions on these services.

User Feedback Approach

Not Given

Other Architecture

The DataFed architecture has been used as a model for demonstrating the "System of Systems" aspect of GEOSS.

User Provided Content