Provider Abbreviation: DataFed
Provider URL: http://datafed.net
Location: 38.65, -90.30
Networks: HTAP, GEOSS, ACPortal
About the Data Provider (Purposes, Audience):
About the Data System (Purposes, Audience):
DataFed is Web services-based software that non-intrusively mediates between autonomous, distributed data providers and users. DataFed is designed in accordance with the GEOSS architecture: it provides standard interfaces to heterogeneous distributed data, fosters data integration and use with processing web services and tools, and collects metadata and user feedback on datasets. DataFed also provides standards-based data feeds to the NASA Giovanni System.
[[DataSystemHistory::The federated data system, DataFed, has been in development since 2001 at Washington University, CAPITA, with grants from NSF, NASA, EPA and RPOs. Since 2004, DataFed has provided both regulatory and policy support to EPA. Within CAPITA, DataFed has become a scientific data analysis tool.]]
List of Publications, Papers, Presentations:
[[DataSystemRef::FASTNET, EPA Exceptional Event Project, Interoperability of Web Service-Based Data Access and Processing... ESTO 2006, Combined Aerosol Trajectory Tool, CATT, DataFed: Mediated Web Services for Distributed AQ Data Access and Processing IGARS 2007, Interoperable Info System of Systems for HTAP HTAP 2007]]
Data System Scope
[[DataSystemDataSets::See Dataset Catalog]]
[[DataSystemParam::DataFed provides access to more than 100 distributed, air-quality-relevant datasets (surface, satellite, and model), which can be explored and analyzed with tools for processing and visualization.]]
Spatial - Temporal Coverage:
About half of the datasets are global in scale, a third are US-scale, and the rest cover other regions. Most datasets are multi-year in extent. About ten datasets are near-real-time.
There are currently no applications of DataFed to health studies. However, the datasets mediated through DataFed are suitable for health studies, particularly in conjunction with the 1km-resolution US population data.
Forecasting and Reanalysis:
[[DataSystemAppFcstReAnaly::A current NASA project with BAMS uses DataFed to assimilate surface observations into a forecast model. We are not aware of any formal Air Quality Reanalysis effort; hopefully, the community will]]
[[DataSystemAppModelEval::The EPA NEISGEI Project uses DataFed to integrate and to evaluate multiple emission databases. DataFed was used to prepare an evaluation of the CMAQ Aerosol Model with IMPROVE and FRM data. In the NASA project with BAMS, DataFed provides surface observations for assimilation into a forecast model. Add HTAP data integration...]]
Characterization, Trends, Accountability:
[[DataSystemAppCharact::DataFed was used to perform aerosol characterization for the RPO project, FASTNET. DataFed is the main data source supporting the development of EPA's Exceptional Event Rule. It is now used in the implementation of the EE Rule.]]
[[DataSystemAppOther::Since 2004, a major role of DataFed has been to participate in interoperability experiments for GEOSS.]]
Data System IT
Primary/Official Store for Some data:
DataFed is a mediator of data flow between providers and users. It does not store primary/official data.
Data consolidation from heterogeneous to homogeneous structure is performed on the fly for most datasets. Many historical datasets are cached at DataFed for fast data access and browsing.
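The on-the-fly consolidation described above can be pictured with a minimal sketch. The two provider formats, the field names, and the common record layout below are all hypothetical, chosen only to illustrate the idea of mapping heterogeneous records into one homogeneous structure; they are not DataFed's actual schema or wrapper code.

```python
# Hypothetical common record layout: location, time, parameter, value.
COMMON_KEYS = ("location", "time", "parameter", "value")

def wrap_provider_a(row):
    """Provider A (hypothetical) delivers dicts with 'site', 'date', 'pm25'."""
    return {
        "location": row["site"],
        "time": row["date"],
        "parameter": "PM25",
        "value": row["pm25"],
    }

def wrap_provider_b(row):
    """Provider B (hypothetical) delivers tuples (station, timestamp, param, val)."""
    station, timestamp, param, val = row
    return {
        "location": station,
        "time": timestamp,
        "parameter": param,
        "value": val,
    }

# Records from both providers end up in the same structure.
rec_a = wrap_provider_a({"site": "STL01", "date": "2007-07-04", "pm25": 21.5})
rec_b = wrap_provider_b(("CHI07", "2007-07-04", "PM25", 18.0))
print(rec_a)
print(rec_b)
```

Once every source is mapped to one layout, downstream tools can treat all datasets uniformly, regardless of how each provider stores them.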
Providing Data Access to users/externals:
[[DataSystemValueAccess::DataFed homogenizes distributed, heterogeneous datasets through data 'wrappers'. As a result, all the data mediated in DataFed are accessible through international standard data access services, OGC WCS and WMS. At this time, all data access services are free and offered through an open interface.]]
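As an illustration of what access through OGC WMS looks like from the client side, the sketch below assembles a WMS 1.1.1 GetMap request URL. The query parameters are the standard GetMap ones; the endpoint URL and the layer name are placeholders assumed for this example, not actual DataFed addresses or dataset identifiers.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); substitute a real WMS service URL.
BASE_URL = "http://example.org/wms"

def build_getmap_url(base_url, layer, bbox, time, width=800, height=400):
    """Return a WMS 1.1.1 GetMap URL for one layer, bounding box, and time."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty string selects the default style
        "SRS": "EPSG:4326",      # geographic latitude/longitude
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TIME": time,
    }
    return base_url + "?" + urlencode(params)

# "example_pm25" is a hypothetical layer name.
url = build_getmap_url(BASE_URL, "example_pm25",
                       (-125.0, 24.0, -66.0, 50.0), "2007-07-04")
print(url)
```

Because the request is a plain URL, any WMS-aware client (a browser, GIS package, or script) can retrieve the same map without provider-specific code.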
The processing of raw data is performed by reusable web-service components, which include filtering, aggregation, and data fusion services. Data processing applications are created by chaining services using workflow software.
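The chaining idea can be sketched in a few lines. In DataFed the components are web services connected by workflow software; in this simplified sketch, local Python functions stand in for the services, and the record layout and function names are assumptions made for illustration only.

```python
from functools import reduce
from statistics import mean

def filter_valid(records):
    """Filtering step: drop records with missing values."""
    return [r for r in records if r["value"] is not None]

def aggregate_by_station(records):
    """Aggregation step: average the values per station."""
    groups = {}
    for r in records:
        groups.setdefault(r["station"], []).append(r["value"])
    return {station: mean(values) for station, values in groups.items()}

def chain(*services):
    """Compose services left to right into a single workflow callable."""
    return lambda data: reduce(lambda acc, svc: svc(acc), services, data)

workflow = chain(filter_valid, aggregate_by_station)

records = [
    {"station": "A", "value": 10.0},
    {"station": "A", "value": None},
    {"station": "A", "value": 14.0},
    {"station": "B", "value": 8.0},
]
result = workflow(records)
print(result)
```

Each step only depends on the output of the previous one, so steps can be reordered, replaced, or reused in other workflows without changing the rest of the chain.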
[[DataSystemValueVis::The visualization tools for parameter-spatial-temporal browsing are applicable to every dataset in the federated data system. The output data from the processing services are also available for mashups with other popular tools, e.g. Google Earth and GIS software.]]
Decision Support (e.g. some integration into user business process):
[[DataSystemValueDecisionSupport::DataFed has served the RPOs through the FASTNET and Combined Aerosol Trajectory Tool (CATT) projects. More recently, DataFed has supported decisions for the Exceptional Event Rule for PM2.5 and ozone.]]
[[EndtoEndIntegration::Data access, processing and visualization are all performed within DataFed. Specific workflow configurations are created from the loosely coupled web services for different scientific analysis or decision-support applications. An example custom workflow is the Combined Aerosol Trajectory Tool (CATT).]]
Other DS Values:
Data Access and/or Output Interoperability:
[[DataSystemArchInterop::Both the raw input data and the processed outputs are accessible through international standard interfaces. This allows the creation of loosely-coupled network applications (Paper-PDF).]]
Reusable Tools and Methods:
[[DataSystemArchToolsMethods::The data access, processing and visualization services in DataFed are all built from reusable Web services, available through both SOAP and REST protocols (Paper-PDF).]]
Security Barriers and Solutions:
The data access and processing services are accessible through the SOAP-WSDL protocol, which is designed to pass through firewalls. At this time there are no access restrictions to these services.
User Feedback Approach:
[[DataSystemArchUserFeedbck::For each dataset registered in DataFed there is a "DataSpace" wiki page for the collection of dataset-relevant information, including user feedback (e.g. AirNOW).]]
[[DataSystemArchOther::The DataFed architecture has been used as a model for demonstrating the "System of Systems" aspect of GEOSS.]]