Applications of Semantic Web for Earth Science

From Earth Science Information Partners (ESIP)
Revision as of 12:03, January 9, 2012

Introduction

Semantic web technology is becoming ever more important in Earth Science applications in a number of diverse roles. Furthermore, it is likely to become an even more important enabler as ambitious data science efforts, such as the EarthCube initiative and ESIP's own Earth Science Collaboratory, move forward. These enterprises seek to bring together disparate datasets, disciplines, and even communities in an effort to leverage our burgeoning data in the service of understanding the Earth as a system. As these various resources and the communities leveraging them diversify, the need for semantic technology to help users navigate the sea of resources becomes more apparent. Indeed, this role in discovery is acknowledged in the key capabilities determined through the first EarthCube Charrette.

However, we should not neglect the important role semantic technology can and does play in other aspects of data for Earth Sciences. For instance, semantic technology plays a key role in several other areas noted in the EarthCube Charrette capabilities:

  • Automated Quality Assurance and Quality Control
  • Provenance capture and interpretation
  • Workflow construction
  • Data fusion

Many such applications are underpinned by semantic technology, with the result that its value is not always readily apparent. In this short white paper, we discuss several ongoing or completed projects and applications that use the semantic web as an underpinning, in order to raise awareness of this critical technology.

Data Quality Screening Service

The Data Quality Screening Service (DQSS) is designed to help automate the filtering of remote sensing data on behalf of science users. Whereas this process today involves laborious reading of quality documentation, followed by equally laborious coding, the DQSS acts as a Web Service that provides data users with data pre-filtered to their particular criteria, while at the same time guiding the user with the filtering recommendations of the cognizant data experts. Data that do not pass the criteria are replaced with fill values, resulting in a file that has the same structure and is usable in the same ways as the original (Fig. 1).
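The screening step described above can be sketched in a few lines. This is an illustrative example only, not DQSS code: the fill value, flag convention, and threshold semantics are assumptions for the sake of the sketch.

```python
import numpy as np

# Hypothetical fill value; real products define their own.
FILL_VALUE = -9999.0

def screen(data, quality_flags, max_flag):
    """Return a copy of `data` with low-quality cells set to FILL_VALUE.

    `quality_flags` holds one integer flag per cell; cells whose flag
    exceeds `max_flag` (the user's chosen criterion) are screened out.
    The output keeps the shape and layout of the input array.
    """
    screened = data.copy()
    screened[quality_flags > max_flag] = FILL_VALUE
    return screened

# Example: a small "total precipitable water" array with per-cell flags.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
flags = np.array([[0, 2, 0],
                  [3, 0, 1]])
result = screen(data, flags, max_flag=1)
# Cells with flag > 1 become FILL_VALUE; the array shape is unchanged.
```

Because only values are replaced, not removed, downstream tools that read the original file layout continue to work on the screened file.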

Fig.1. Data Quality Screening Service showing a data array before screening, the quality criteria mask used for screening and the data array after screening. The scene is for Total Precipitable Water over Hurricane Ike on 9 September 2008. The figure on the left shows anomalously dry areas on the east side of the hurricane; however, these turn out to be low quality retrievals (center) and thus are removed from the data array by the screening process.

At the core of DQSS is an ontology that describes data fields, the quality fields for applying quality control and the interpretations of quality criteria. This allows a generalized code base that can nonetheless handle both a variety of datasets and a variety of quality control schemes. Indeed, a data collection can be added to the DQSS simply by registering instances in the ontology if it follows a quality scheme that is already modeled in the ontology. This will allow DQSS to scale to more data products with minimal cost.
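The registration idea above can be illustrated with a minimal sketch. All names here (the scheme, field names, and helper function) are hypothetical and stand in for ontology instances; the point is that a collection whose quality scheme is already modeled needs only a registration entry, not new code.

```python
# Quality schemes modeled once, analogous to classes in the ontology:
# each scheme maps flag values to their interpretation.
QUALITY_SCHEMES = {
    "confidence_flags": {0: "best", 1: "good", 2: "do_not_use"},
}

# Registered data collections, analogous to instances in the ontology.
REGISTRY = {}

def register_collection(name, data_field, quality_field, scheme):
    """Add a data collection by pointing it at an already-modeled scheme."""
    if scheme not in QUALITY_SCHEMES:
        raise ValueError(f"quality scheme {scheme!r} is not modeled yet")
    REGISTRY[name] = {
        "data_field": data_field,
        "quality_field": quality_field,
        "scheme": scheme,
    }

# A new collection is added by registration alone; the generalized
# screening code consults the registry and scheme at run time.
register_collection("ExampleProduct", "TotalWater", "WaterQualFlag",
                    "confidence_flags")
```

This separation of model (schemes) from instances (collections) is what lets a single code base handle many datasets and scale to new products at minimal cost.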