Semantic Web Tutorial

From Earth Science Information Partners (ESIP)

Hands on: semantic web application development from use case to user testing.

This 6-hour session, presented over two half-days, is intended to serve both the novice/beginner/manager and those already familiar with semantic web technologies, in the context of an end-to-end application example. Material to be covered includes: motivation and use case development; use case analysis and model/ontology development; technology and infrastructure options and methods for choosing among them; implementation and simple application development; user testing and discussion.

Sponsors: Semantic Web Cluster, with assistance from the Products & Services Committee and Information Technology & Interoperability Committee.


See - http://wiki.esipfed.org/index.php/Semantic_Web_Tutorials#Hands-on_tutorial_for_developing_a_semantic_web_application_using_the_ESIP_SW_testbed


Notes from the Session

Wednesday

Peter Fox

Semantic Web Development Cycle

  • Use case
  • Small team and mixed skills
  • Analysis
  • Develop model ontology
  • Use tools
  • Science reviews and iteration
  • Adopt Technology approach
  • Leverage technology Infrastructure
  • Rapid Prototype
  • Open world: evolve iterate etc.
  • All of which are evaluated and lead back to the use case.


Chris Lynnes: Use Cases

Use Case
Problem Statement: Within ESIP it is difficult to know who is working on similar technologies and who is working with similar datasets. If we could find out this information, we could improve knowledge sharing and even make new connections. Meanwhile, agency program managers sometimes need to know who is working on what data with what technologies.

Basic concept: Develop a simple ontology of projects, people, datasets, and technologies. Populate it with ACCESS, MEaSUREs, and similar project instance information, and store the triples in the ESIP Testbed.

Sample queries:

  • Which projects are working with semantic web technology?
  • What technologies are being used in NASA ACCESS?
  • Are any projects working with the same dataset?
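
As a rough illustration of the basic concept, the three sample queries can be answered over a handful of hand-written triples. This is a minimal Python sketch, not the testbed's implementation: the project names, dataset names, technology values, and predicate names (`fundedBy`, `worksWithTechnology`, `usesDataset`) are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical instance data as (subject, predicate, object) triples.
TRIPLES = [
    ("ProjectA", "fundedBy", "NASA ACCESS"),
    ("ProjectA", "worksWithTechnology", "Semantic Web"),
    ("ProjectA", "usesDataset", "DatasetX"),
    ("ProjectB", "fundedBy", "NASA ACCESS"),
    ("ProjectB", "worksWithTechnology", "OPeNDAP"),
    ("ProjectB", "usesDataset", "DatasetX"),
    ("ProjectC", "fundedBy", "MEaSUREs"),
    ("ProjectC", "worksWithTechnology", "Semantic Web"),
    ("ProjectC", "usesDataset", "DatasetY"),
]


def subjects(predicate, obj):
    """Return the sorted subjects s for which (s, predicate, obj) is stored."""
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == obj)


def objects(subject, predicate):
    """Return the sorted objects o for which (subject, predicate, o) is stored."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def projects_using(technology):
    """Which projects are working with a given technology?"""
    return subjects("worksWithTechnology", technology)


def technologies_in_program(program):
    """What technologies are being used in a given program?"""
    techs = set()
    for project in subjects("fundedBy", program):
        techs.update(objects(project, "worksWithTechnology"))
    return sorted(techs)


def shared_datasets():
    """Which datasets are used by more than one project?"""
    by_dataset = defaultdict(set)
    for s, p, o in TRIPLES:
        if p == "usesDataset":
            by_dataset[o].add(s)
    return {d: sorted(ps) for d, ps in by_dataset.items() if len(ps) > 1}
```

Each sample query becomes one small function over the same list of triples, which is the point of the basic concept: once the instance data is in triple form, new questions need no new tables or schema changes.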

Use Case Template:

  1. Use Case Identification
  2. Short Definitions
  3. Primary Actor
  4. Purpose
  5. Assumptions about use case
  6. Scenario: the meat of the use case, taken step by step
  7. Extensions
  8. Definition of success
  9. Notes/ issues

Use Case 1: Enter project data. Populate the triple store with project information through a simple web interface. The actor is a project staff member. The purpose is to enter information into the triple store. Assumption: the staff member knows enough about the project to enter the information.

Use Case 2: Produce program technology inventory. A program manager gets a report on which technologies are being used in his or her program. The actor is the program manager. The purpose is to get a report of technology usage for planning purposes. Assumption: the program's projects have been entered in the triple store.

Use Case 3: Discover dataset commonality. Discover other projects working with the same datasets. The actor is a project investigator. The purpose is to discover other projects working on the same dataset. The triple store is used to hold the data and find the rest.

Peter Fox

Elements of KR in Semantic web
Declarative knowledge
Statements as triple {subject-predicate-object}
Interferometer is-a optical instrument etc
A query: select all optical instruments which have operating mode vertical.
An inference: infer operating modes for a Fabry-Perot Interferometer which measures neutral temperature
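
The three ideas above (statements, a query, an inference) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predicate names `is-a` and `hasOperatingMode`, the `Magnetometer` entry, and the single rule used here (`is-a` is transitive) are invented for the example and are not the ontology or reasoner used in the session.

```python
# Hypothetical declarative statements as (subject, predicate, object) triples.
TRIPLES = {
    ("Interferometer", "is-a", "OpticalInstrument"),
    ("FabryPerotInterferometer", "is-a", "Interferometer"),
    ("FabryPerotInterferometer", "hasOperatingMode", "vertical"),
    ("Magnetometer", "hasOperatingMode", "vertical"),
}


def infer_is_a(triples):
    """Return the triples plus every is-a link implied by transitivity."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        is_a = [(s, o) for s, p, o in inferred if p == "is-a"]
        for a, b in is_a:
            for c, d in is_a:
                if b == c and (a, "is-a", d) not in inferred:
                    inferred.add((a, "is-a", d))
                    changed = True
    return inferred


def optical_instruments_with_mode(triples, mode):
    """Query: all optical instruments that have the given operating mode."""
    optical = {s for s, p, o in triples
               if p == "is-a" and o == "OpticalInstrument"}
    return sorted(s for s, p, o in triples
                  if p == "hasOperatingMode" and o == mode and s in optical)


closure = infer_is_a(TRIPLES)
```

Run against the raw statements, the query finds nothing, because no triple says directly that a Fabry-Perot interferometer is an optical instrument. Run against `closure`, it finds `FabryPerotInterferometer`, which is the kind of answer an inference step makes available.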
Representing knowledge with objects
Take all individuals that we need to keep track of and place them into different buckets based on how similar they are to each other. Each bucket is given a descriptive name based on what objects it contains.
Since the individuals in a given bucket are at least somewhat similar, we can avoid needing to describe every inconsequential detail about each individual. Assign properties that are common to all individuals.
Cannot have multiple inheritances in ontology. Can have several classes in the same but cannot inherit.
UML: Unified Modeling Language
Ontology Definition Metamodel / Meta-Object Facility for UML
Whiteboards and text files are helpful for developing a visual knowledge representation.
Interactive portion! Use case model maps, or CMAPs
Peter used these CMAPs to get the crowd to share, question each other, and work together to figure out what to do with the maps and a use case. He sparked an interesting discussion and showed how preliminary ideas gradually become more focused and less general.
First iteration: instead of implementing an interface as defined in the add record, we will generate a set of instances by hand so that we can prototype rapidly. We’ll upload them and run some queries.

Thursday

Peter Fox

  • All notes and links from Wednesday’s meeting are on the Wiki: click Semantic Web and then Tutorials.
  • The main part of the tutorial used the CMAP ontology model v1a.
  • The main idea of this map was to answer the use cases and map the ontologies.
  • Swoop is an ontology editor that is sometimes better to work with than other editors; the editors are interoperable.

Application definition: examine the use case and scope the first iteration; it is useful to think about metrics and record baselines.

Implementation basics: Review your use case with the team and experts. Go into the detail of your ontology and test it. Look at the use case document and examine the actors, process flow, and artifacts. Start to develop a design and an architecture. Keep in mind that it is more flexible to place the formal semantics in the interfaces between layers and components of your architecture, or between users and information, to mediate the exchange.

Actors: The initial analysis will often have many human actors. Look to see where the humans can be replaced by machines; this will require additional semantics and knowledge encoding. If you are in a team, ensure that actors know their role and what inputs, outputs, and preconditions are expected. Often you may be able to run the use case before you build anything.

Process flow: Each element in the process flow usually denotes a distinct stage in what will need to be implemented. Often actors mediate the process flow. Use an activity diagram as a means to turn the written process flow into a visual one.

Preconditions: These are often very syntactic and may not be ready to fit with a semantically rich implementation, so some level of modeling of these preconditions may be required. Beware of using other entities' data and services.

Artifacts: Add artifacts that use the use case to the resources list in the table. An artifact is any digital object that is added to the system and recognized by the system. It is often useful to record which artifacts are critical. Be thinking of provenance and the way these artifacts were produced. Engage the actors to determine the names of these artifacts and who should have responsibility for them.

Review the resources: Other than actors and artifacts, what else do you need? Your knowledge encoding is also a resource, so treat it as important. A testbed with local data is often very useful at the start of implementation.

Knowledge encoding: Declarative languages (OWL, RDFS, etc.); science expert review and iteration; have something to review.

Back to use case 1 of entering project data.

The FOAF namespace needs to go in the header of the v1 RDFS document.

Hook Hua: Querying the Ontology

RDF = Resource Description Framework: built on the triple (subject, predicate, object)

URI references can be written out completely in angle brackets

Shorthand: URI references can be abbreviated as XML qualified names, written without angle brackets

Using formal Uniform Resource Identifiers, identifying things using web identifiers
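
To make the two notations concrete, here is a small Python sketch that expands a qualified name into the full URI it abbreviates. The `foaf` and `rdf` namespace URIs are the published ones; the `ex` prefix and the helper name `expand` are hypothetical, chosen only for this illustration.

```python
# Prefix table: the foaf and rdf namespaces are the standard published ones;
# "ex" is a hypothetical namespace used only for this sketch.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/ns#",
}


def expand(term, prefixes=PREFIXES):
    """Return the full URI for a term written either way.

    <http://...>  -> the URI between the angle brackets
    prefix:local  -> the namespace bound to the prefix, followed by the local name
    """
    if term.startswith("<") and term.endswith(">"):
        return term[1:-1]
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in term: {term!r}")
    return prefixes[prefix] + local
```

For example, `expand("foaf:name")` and `expand("<http://xmlns.com/foaf/0.1/name>")` return the same string, which is why the shorthand only works when the prefix has been declared, as with the FOAF namespace in the document header.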

SPARQL: W3C standard query language for RDF; it queries at the RDF level, not the OWL level

OWL Query Language: obsolete

SPARQUL: Peter Fox collaboration

Query Lang: commercial and embedded QL

SPARQL: defined in terms of the W3C's RDF data model; hard to use OWL with this

Implementations: ARQ: SPARQL processor for Jena; Pellet: OWL DL reasoner with some SPARQL support.

SPARQL QUERIES: ? denotes a variable

Example:

SELECT ?project ?technology
WHERE
{
  ?project <http://esipfed.orgworkswithtechnology> ?technology .
}
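
What the query does can be imitated in plain Python: each `?name` in the pattern is a variable that gets bound to whatever sits in that position of a matching triple. This is a sketch of single-pattern matching only, not of a real SPARQL engine such as ARQ; the `ex:` names and the `match` helper are hypothetical.

```python
# Hypothetical data; the ex: names are placeholders, not testbed content.
TRIPLES = [
    ("ex:ProjectA", "ex:worksWithTechnology", "ex:SPARQL"),
    ("ex:ProjectA", "ex:usesDataset", "ex:DatasetX"),
    ("ex:ProjectB", "ex:worksWithTechnology", "ex:OWL"),
]


def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Terms starting with "?" are variables; every other term must be equal
    to the value in the same position of the triple.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


solutions = list(match(("?project", "ex:worksWithTechnology", "?technology"),
                       TRIPLES))
```

Each entry in `solutions` corresponds to one row of the SELECT result, with `?project` and `?technology` as the columns.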

Testbed: work in progress, some artifacts only applicable to the current testbed.

Not recommended for large triple stores: queries end up with tons of results.

Filtering solutions / regex(): FILTERs restrict solutions to those for which the filter expression evaluates to true. When you do not know the exact resource names, use regex() for partial matches.
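
The effect of a regex filter on a set of solutions can be sketched in Python. The solution rows and the helper name `regex_filter` are hypothetical, and Python's `re` syntax differs in some details from the XPath-style regular expressions that SPARQL's `regex()` uses, so treat this as an analogy rather than an exact equivalent.

```python
import re

# Hypothetical solutions, shaped like the rows a SELECT query would return.
SOLUTIONS = [
    {"?project": "ex:ProjectA", "?technology": "Semantic Web"},
    {"?project": "ex:ProjectB", "?technology": "OPeNDAP"},
    {"?project": "ex:ProjectC", "?technology": "semantic mediation"},
]


def regex_filter(solutions, variable, pattern, flags=""):
    """Keep only the solutions whose value for `variable` matches `pattern`.

    Roughly mirrors FILTER regex(?variable, "pattern", "i"): the match may
    occur anywhere in the value, and the "i" flag ignores case.
    """
    compiled = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    return [s for s in solutions if compiled.search(s[variable])]


matches = regex_filter(SOLUTIONS, "?technology", "semantic", "i")
```

With the `"i"` flag the filter keeps both "Semantic Web" and "semantic mediation"; without it, only the lowercase value matches.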

Solution modifiers: ORDER BY puts solutions in order; projection chooses certain variables; DISTINCT and REDUCED deal with duplicate solutions; OFFSET skips leading solutions; LIMIT caps the number of results.

ORDER BY sorts the results: it takes a sequence of order comparators, each composed of an expression and an optional order modifier (ASC or DESC).
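
The modifiers are applied in a fixed sequence: order, project, remove duplicates, then slice with offset and limit. The Python sketch below follows that sequence over a list of solution rows. The function name, its parameters, and the sample rows are hypothetical; it supports a single sort variable only and leaves out REDUCED.

```python
# Hypothetical solutions, shaped like the rows a SELECT query would return.
SOLUTIONS = [
    {"?project": "B", "?technology": "OWL"},
    {"?project": "A", "?technology": "SPARQL"},
    {"?project": "C", "?technology": "OWL"},
    {"?project": "A", "?technology": "OWL"},
]


def apply_modifiers(solutions, order_by=None, descending=False,
                    project=None, distinct=False, offset=0, limit=None):
    """Apply ORDER BY, projection, DISTINCT, OFFSET and LIMIT, in that order."""
    rows = list(solutions)
    if order_by is not None:
        # ORDER BY ?var, with DESC when descending is True.
        rows.sort(key=lambda row: row[order_by], reverse=descending)
    if project is not None:
        # Projection: keep only the selected variables.
        rows = [{var: row[var] for var in project} for row in rows]
    if distinct:
        # DISTINCT: drop rows identical to one already kept.
        seen, unique = set(), []
        for row in rows:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                unique.append(row)
        rows = unique
    rows = rows[offset:]          # OFFSET: skip the first `offset` rows.
    if limit is not None:
        rows = rows[:limit]       # LIMIT: return at most `limit` rows.
    return rows
```

For instance, `apply_modifiers(SOLUTIONS, order_by="?technology", project=["?technology"], distinct=True)` lists each technology once, and adding `limit` is the usual guard against a query that would otherwise return far more rows than anyone can read.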


DO NOT EXPOSE UNDERLYING QUERYING LANGUAGE TO USERS


DO NOT EXPECT USERS TO KNOW YOUR ONTOLOGY EXACTLY
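
One way to honor both rules is to put a small function between the user and the store, so the user types an ordinary word and never sees a query or a predicate name. The sketch below is only an assumption about how that could look: the data, the `worksWithTechnology` predicate, and the function name are hypothetical.

```python
# Hypothetical instance data; the predicate name is a placeholder.
TRIPLES = [
    ("ProjectA", "worksWithTechnology", "Semantic Web"),
    ("ProjectB", "worksWithTechnology", "OPeNDAP"),
    ("ProjectC", "worksWithTechnology", "semantic mediation"),
    ("ProjectC", "usesDataset", "DatasetY"),
]


def find_projects_by_technology(user_text):
    """Return projects whose technology contains the user's text.

    The caller never sees the query language or the predicate name, and a
    partial, differently cased term still matches.
    """
    needle = user_text.strip().lower()
    if not needle:
        return []
    return sorted({s for s, p, o in TRIPLES
                   if p == "worksWithTechnology" and needle in o.lower()})
```

A web form would call something like `find_projects_by_technology("semantic")` and render the list, keeping the ontology's exact terms an internal detail.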

RDFa representation to RDF

Can't run a SPARQL query on RDFa

Suggestions: Offer tutorials and recorded tutorials, and re-offer this session sometime down the line. It needs to be accessible to everyone, with short recordings that give people an idea of the vocabulary and pace required. Ideally it would be more hands-on, so everyone could walk through the instance generation together.

EMAIL PETER FOX WITH IDEAS.