ESSI-LOD Python Framework

From Earth Science Information Partners (ESIP)
Revision as of 19:58, June 6, 2013 by Erozell89

Overview

In a prior implementation of the ESSI-LOD linked data, we used a Java code base to scrape AGU HTML pages for abstracts. Having coordinated with AGU to convert directly from bulk text files of meeting abstracts, we rewrote the implementation in Python. This wiki page documents how to use (and extend) this Python framework.

Setting up the Framework

Dependencies

  • Python (version?)
  • Virtuoso (version?)

Sources

  • (Google Code?)
  • AGU Meeting Files (.txt)

Usage

Entity Classes

The Entity classes are for objects converted directly into RDF. They are typically instantiated either by a Stream class or by another Entity class.
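As a rough illustration of the pattern described above, the sketch below shows one Entity (Abstract) that serializes itself as RDF and instantiates another Entity (Author). Only the class names come from this page. The constructor arguments, the to_ntriples method, the example.org namespace, and the choice of Dublin Core and FOAF properties are assumptions made for the example, not the framework's actual API or vocabulary.

```python
# Hypothetical sketch: class names follow this page, everything else is assumed.
BASE = "http://example.org/essi-lod/"  # placeholder namespace, not the project's
DC = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"


def literal(text):
    """Escape a string as a plain N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"%s"' % escaped


class Author:
    """An Entity that is created by another Entity (see Abstract)."""

    def __init__(self, author_id, name):
        self.uri = BASE + "author/" + author_id
        self.name = name

    def to_ntriples(self):
        return ["<%s> <%sname> %s ." % (self.uri, FOAF, literal(self.name))]


class Abstract:
    """An Entity that instantiates its Author entities itself."""

    def __init__(self, abstract_id, title, author_records):
        self.uri = BASE + "abstract/" + abstract_id
        self.title = title
        self.authors = [Author(aid, name) for aid, name in author_records]

    def to_ntriples(self):
        lines = ["<%s> <%stitle> %s ." % (self.uri, DC, literal(self.title))]
        for author in self.authors:
            # Link the abstract to the author, then emit the author's own triples.
            lines.append("<%s> <%screator> <%s> ." % (self.uri, DC, author.uri))
            lines.extend(author.to_ntriples())
        return lines
```

Calling Abstract("IN11A-01", "Some title", [("a1", "J. Doe")]).to_ntriples() would return three N-Triples lines: the title, the link to the author, and the author's name.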

Meeting

Session

Abstract

Section

Author

Convener

Keyword

Stream Classes

The Stream classes are utility classes for consuming bulk files containing entities. They are typically instantiated directly from a main program.
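The sketch below illustrates how a Stream class might be driven from a main program. The layout of the AGU bulk files is not described on this page, so the format assumed here (blank-line-separated records made of "Key: value" lines) is purely hypothetical, as are the constructor signature and the field names. Only the name MeetingStream comes from this page.

```python
import io


class MeetingStream:
    """Iterate over records in a bulk text file (assumed, hypothetical layout)."""

    def __init__(self, handle):
        self.handle = handle  # any iterable of text lines, e.g. an open file

    def __iter__(self):
        record = {}
        for line in self.handle:
            line = line.strip()
            if not line:
                # A blank line closes the current record, if there is one.
                if record:
                    yield record
                    record = {}
                continue
            key, _, value = line.partition(":")
            record[key.strip()] = value.strip()
        if record:
            # Emit the last record when the file does not end with a blank line.
            yield record


def main():
    # An in-memory stand-in for a meeting file; real input would be open(path).
    sample = io.StringIO(
        "ID: IN11A-01\nTitle: Linked data for ESSI\n\n"
        "ID: IN11A-02\nTitle: Another abstract\n"
    )
    return [record["ID"] for record in MeetingStream(sample)]
```

In a real program, each yielded record would presumably be handed to an Entity class rather than collected into a list.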

MeetingStream

SessionInfoStream

Lookup Classes

The Lookup classes are used for entity conflation, i.e., to consolidate URIs when the evidence supports it.

OrganizationLookup
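One simple way to picture URI consolidation is a lookup that maps differently written organization names onto a single URI. The sketch below treats a match on the normalized name as the only evidence. The evidence the framework actually uses is not documented here, and the get_uri and normalize methods and the example.org namespace are assumptions made for illustration.

```python
import re

BASE = "http://example.org/essi-lod/organization/"  # placeholder namespace


class OrganizationLookup:
    """Hand out one URI per organization, keyed on a normalized name."""

    def __init__(self):
        self._uris = {}

    @staticmethod
    def normalize(name):
        # Lowercase, turn runs of non-alphanumerics into spaces, trim the ends.
        key = re.sub(r"[^a-z0-9]+", " ", name.lower())
        return " ".join(key.split())

    def get_uri(self, name):
        key = self.normalize(name)
        if key not in self._uris:
            self._uris[key] = BASE + key.replace(" ", "_")
        return self._uris[key]
```

With this sketch, "Rensselaer Polytechnic Institute" and "rensselaer  polytechnic institute." resolve to the same URI, while an unrelated name receives a new one. Name normalization alone will miss abbreviations and will merge distinct organizations that share a name, so stronger evidence would be needed in practice.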