Difference between revisions of "Strategic Vision"

From Earth Science Information Partners (ESIP)

Revision as of 12:36, August 11, 2015

Introduction

The ESIP Semantic Web cluster is approaching 10 years old and is one of the oldest clusters within ESIP. During this decade of existence the tide has shifted to federal agencies such as NASA, DOE, and NSF CISE fully embracing semantic technologies. The amount of Linked Data is increasing at a staggering rate. There are new improvements for dataset descriptions (e.g. time and space) within commercial efforts such as schema.org. Semantic Web Technologies are now a mainstay in the geoscience information and data management community.

The time has come to move beyond prototypes and proofs of concept. To this end, the Semantic Web Cluster is developing a Strategic Vision and Road Map for the next 3 to 5 years. We aim to create a living document that will synthesize existing semantic efforts and guide future research and development. We want to coalesce our broad knowledge base and develop towards a common long term cyberinfrastructure. We would like to continue the tradition of the geosciences being an early adopter and feedback loop for the broader Semantic Web community.

To achieve these goals, the Cluster will focus its efforts on the following areas:

Creating an Inventory of Existing Projects

Projects to which Cluster members are contributing

  • Global Change Master Directory (GCMD) Ontology - A GCMD Platform-Instrument-Sensor ontology that uses existing GCMD keyword hierarchies and SKOS concepts. As of 8/2015, the ontology was under review by NASA and not publicly available.
  • [ https://data.globalchange.gov/ Global Change Information System ] - from the GCIS website "The GCIS is an open-source, web-based resource for traceable, sound global change data, information, and products. Designed for use by scientists, decision makers, and the public, the GCIS provides coordinated links to a select group of information products produced, maintained, and disseminated by government agencies and organizations. As well as guiding users to global change research products selected by the 13 member agencies, the GCIS serves as a key access point to assessments, reports, and tools produced by the USGCRP. The GCIS is managed, integrated, and curated by USGCRP."
  • [ http://semanticportal.esipfed.org ESIP Ontology Portal ] - an ESIP-hosted ontology portal, based on BioPortal, to house and align Earth science ontologies
  • [ http://toolmatch.esipfed.org/index ToolMatch ] - a semantic-based system for matching data to software tools and answering use cases such as “I have data and need to know which tools I can use”, for example “I just downloaded an AIRS Level 2 Standard Retrieval file. How can I look at it?”. In addition to the project homepage, there is also the [ http://github.com/ESIPFed/Toolmatch ToolMatch GitHub Repository ]
  • [ http://www.geolink.org GeoLink ] - The GeoLink project brings together experts from the geosciences, computer science, and library science in an effort to develop Semantic Web components that support discovery and reuse of data and knowledge. GeoLink's participating repositories include content from field expeditions, laboratory analyses, journal publications, conference presentations, theses/reports, and funding awards that span scientific studies from marine geology to marine ecosystems and biogeochemistry to paleoclimatology.

The Linked Science Cloud

He suggests looking at today’s Linked Geoscience Data Cloud to see the breadth of what is being represented. Search engines are taking notice. There are new improvements for dataset descriptions, e.g. time and space in schema.org, and more use of PROV for provenance. GCIS is an epistemology, more than just a knowledge base (https://en.wikipedia.org/wiki/Epistemology). In the last 5 years, he has learned not to go straight to encoding an ontology, but rather to start with conceptual and information models so that a diversity of encoding choices remains open. He has a current interest in semiotics, the study of signs (https://en.wikipedia.org/wiki/Semiotics): syntax, semantics, and pragmatics, with pragmatics addressing use. Related themes are keeping the human in the loop and the cognitive science aspect. It is time to return to process ontologies (not just things, but also processes).
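As an illustration of the schema.org time and space improvements mentioned above, the following is a minimal sketch of a dataset description that uses the schema.org `temporalCoverage` and `spatialCoverage` properties. The helper name `describe_dataset`, the dataset name, and all coverage values are hypothetical and chosen only for illustration; this is not a description of any Cluster dataset.

```python
import json


def describe_dataset(name, start, end, south, west, north, east):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        # Time: an ISO 8601 interval written as "start/end".
        "temporalCoverage": f"{start}/{end}",
        # Space: a bounding box given as two space-separated points,
        # lower corner first, each point as "latitude longitude".
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                "box": f"{south} {west} {north} {east}",
            },
        },
    }


# Hypothetical example values, not taken from a real catalog entry.
example = describe_dataset(
    "Example sea surface temperature grid",
    "2010-01-01", "2010-12-31",
    -10, 20, 10, 40,
)
print(json.dumps(example, indent=2))
```

Embedding such a JSON-LD document in a dataset landing page is one common way to expose time and space information to search engines; provenance could be described separately, for instance with PROV.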

Action Items

  1. Identify existing geoscience Linked Data sets which could be part of the Linked Science Cloud
  2. Identify potential links between these datasets
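One possible first pass at the second action item is sketched below, under the assumption that each Linked Data set is available as a collection of (subject, predicate, object) triples. It reports the IRIs that two datasets have in common, which are candidate link points. The dataset names, IRIs, and function names are all hypothetical; a real effort would also need to consider equivalent resources that are published under different IRIs.

```python
from itertools import combinations


def resources(triples):
    """Collect every IRI used as a subject or object in the triples."""
    found = set()
    for subject, _predicate, obj in triples:
        for term in (subject, obj):
            if term.startswith(("http://", "https://")):
                found.add(term)
    return found


def candidate_links(datasets):
    """Map each pair of dataset names to the set of IRIs they share."""
    links = {}
    for name_a, name_b in combinations(sorted(datasets), 2):
        shared = resources(datasets[name_a]) & resources(datasets[name_b])
        if shared:
            links[(name_a, name_b)] = shared
    return links


# Hypothetical triples; the example.org IRIs are placeholders.
datasets = {
    "cruises": {
        ("http://example.org/cruise/1", "http://example.org/vessel",
         "http://example.org/vessel/A"),
    },
    "samples": {
        ("http://example.org/sample/9", "http://example.org/collectedOn",
         "http://example.org/cruise/1"),
    },
    "awards": {
        ("http://example.org/award/7", "http://example.org/title",
         "Some award title"),
    },
}

for pair, shared in candidate_links(datasets).items():
    print(pair, sorted(shared))
```

Exact IRI matching is deliberately the simplest possible criterion here; it only finds links between datasets that already reuse each other's identifiers.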

Long Term Cyberinfrastructure

At present, there is a known challenge in transitioning from prototype applications to production quality cyberinfrastructure. There is often difficulty in identifying long term hosting and maintenance for our Semantic Web projects. To be fair, this is a common issue among most grant funded information technology projects. The Semantic Web Cluster, and ESIP in general, are committed to identifying new and promising solutions. We are continually working with [ Products_and_Services | ESIP Products and Services ] to explore possibilities.

We have also identified a need to encourage real-time sharing of progress from Testbed projects with the broader ESIP community. The Semantic Web Cluster is exploring the use of Semantic Technologies in this area and will serve as a willing test bed for further development.

- Knowledge store built with a Data Collection form and a Tool form; current matching capability: given a data server and a data format, find a tool or data collection
- Measuring uptake and use of ontologies

Ontology Governance

The Cluster will adopt some form of the [ http://www.obofoundry.org/crit.shtml Open Biological and Biomedical Ontologies ] principles for ontology governance. Exactly how this is to be accomplished is still open for debate.

Open Questions

  1. Are we following all of the [ http://wiki.obofoundry.org/wiki/index.php/OBO_Foundry_Principles_2008 OBO Foundry principles ]? If not, which subset will the Cluster abide by?