Linked Open Research Data for Earth & Space Science Informatics

From Earth Science Information Partners (ESIP)

Co-Leads: Eric Rozell and Tom Narock
Earth and Space Science Informatics (ESSI) is inherently multi-disciplinary, requiring close collaboration between scientists and information technologists. Identifying potential collaborators can be difficult, especially given the rapidly changing landscape of technologies and informatics projects. Being able to discover the technical competencies of other researchers in the community makes it easier to find such collaborations.
We present two contributions toward solving this problem: a pipeline for generating structured data from ESSI abstracts, and an API and Web application for accessing the generated data. We use a Natural Language Processing technique, Named Entity Disambiguation, to extract information about researchers, their affiliations, and the technologies they have applied in their research. We encode the extracted data in the Resource Description Framework (RDF), using Linked Data vocabularies including the Semantic Web for Research Communities (SWRC) ontology and the Friend of a Friend (FOAF) ontology. Lastly, we expose this data in three ways: through a SPARQL endpoint, through Java and PHP APIs, and through a Web application.
In addition to collaboration discovery, this data can be used to analyze trends in the field, which will help project managers distinguish irrelevant, well-established, and emerging technologies and specifications. This information will help keep projects focused on the technologies and standards that are actually in use, making those projects more useful to the ESSI community. Our implementations are open source, and we expect that the pipeline and APIs can evolve with the community.
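To make the RDF encoding and SPARQL access described above more concrete, here is a minimal sketch in Python (standard library only). It is not the project's actual pipeline or schema. The `http://example.org/essi/` namespace, the `usesTechnology` property, the sample researcher, and all function names are hypothetical and chosen for illustration. Only `foaf:Person`, `foaf:Organization`, `foaf:name`, and `swrc:affiliation` are taken from the published FOAF and SWRC vocabularies.

```python
# Illustrative sketch: serialize one extracted researcher record as N-Triples
# and build a SPARQL query over the same (assumed) graph shape.

FOAF = "http://xmlns.com/foaf/0.1/"
SWRC = "http://swrc.ontoware.org/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/essi/"  # hypothetical namespace, not the project's


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def researcher_triples(slug, name, org_slug, org_name, technologies):
    """Return (subject, predicate, object) triples for one researcher."""
    person = iri(EX + "person/" + slug)
    org = iri(EX + "org/" + org_slug)
    triples = [
        (person, iri(RDF_TYPE), iri(FOAF + "Person")),
        (person, iri(FOAF + "name"), literal(name)),
        (org, iri(RDF_TYPE), iri(FOAF + "Organization")),
        (org, iri(FOAF + "name"), literal(org_name)),
        (person, iri(SWRC + "affiliation"), org),
    ]
    for tech in technologies:
        # usesTechnology is an assumed property, not part of FOAF or SWRC.
        triples.append((person, iri(EX + "usesTechnology"), literal(tech)))
    return triples


def to_ntriples(triples):
    """Serialize triples, one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def technology_query(tech):
    """Build a SPARQL query for the names of researchers using a technology."""
    return (
        "SELECT ?name WHERE {\n"
        f"  ?person {iri(FOAF + 'name')} ?name ;\n"
        f"          {iri(EX + 'usesTechnology')} {literal(tech)} .\n"
        "}"
    )


if __name__ == "__main__":
    record = researcher_triples(
        "jane-doe", "Jane Doe",
        "example-university", "Example University",
        ["SPARQL", "OPeNDAP"],
    )
    print(to_ntriples(record))
    print(technology_query("SPARQL"))
```

A query string like the one returned by `technology_query` could be sent to a SPARQL endpoint to find potential collaborators by technology. The endpoint URL and the real graph structure would have to come from the project itself.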