Recommendations for Semantic Web Markup of Existing XML
Overview
The Air Quality Cluster would benefit from advice from the Semantic Web Cluster on how to add semantic markup to the XML-based OGC WMS Capabilities document to identify dataset name, type of data, domain, etc., in order to support a faceted search.
Use Case: Semantic Markup of WMS Capabilities Documents
The Air Quality Cluster is experimenting with structured markup (tagging) of OGC WMS and WCS capabilities documents, placed inside <Keyword> elements, so that the documents can be searched in a structured way. An example query might be "give me the layers where Dataset = 'OMI_AI_G'". See WMS_GetCapabilities#WMS_GetCapabilities_Layer_Description
However, if we are going to implement this kind of markup with a quasi-controlled vocabulary, we should do it in a way that is compatible with, or even leverages, the semantic web. A machine-tag approach has been considered, e.g.
<Keyword>esip:dataset=OMI_AI_G</Keyword>
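To make the structured search described above concrete, here is a minimal Python sketch that reads machine tags of the form namespace:predicate=value out of <Keyword> elements and selects layers by facet. The capabilities fragment is a hypothetical, heavily trimmed stand-in: it omits XML namespaces, and the esip:platform tag, the second layer, and the function names are assumptions made for illustration, not part of the actual service.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed capabilities fragment (XML namespaces omitted for brevity).
# Only the esip:dataset=OMI_AI_G keyword comes from the use case; the rest is assumed.
CAPABILITIES = """
<Capability>
  <Layer>
    <Name>OMI_AI_G</Name>
    <KeywordList>
      <Keyword>esip:dataset=OMI_AI_G</Keyword>
      <Keyword>esip:platform=Satellite</Keyword>
      <Keyword>aerosol</Keyword>
    </KeywordList>
  </Layer>
  <Layer>
    <Name>surface_pm25</Name>
    <KeywordList>
      <Keyword>esip:dataset=SURF_PM25</Keyword>
      <Keyword>esip:platform=Surface</Keyword>
    </KeywordList>
  </Layer>
</Capability>
"""


def parse_machine_tag(text):
    """Split 'namespace:predicate=value' into a 3-tuple; return None for plain keywords."""
    if text is None:
        return None
    head, eq, value = text.strip().partition("=")
    namespace, colon, predicate = head.partition(":")
    if not (eq and colon and namespace and predicate and value):
        return None
    return namespace, predicate, value


def layer_facets(layer):
    """Collect the machine tags directly attached to one <Layer> as {facet: {values}}."""
    facets = {}
    for keyword in layer.findall("KeywordList/Keyword"):
        tag = parse_machine_tag(keyword.text)
        if tag is not None:
            namespace, predicate, value = tag
            facets.setdefault(f"{namespace}:{predicate}", set()).add(value)
    return facets


def find_layers(root, facet, value):
    """Return the names of all layers whose facet contains the given value."""
    return [
        layer.findtext("Name")
        for layer in root.iter("Layer")
        if value in layer_facets(layer).get(facet, set())
    ]


root = ET.fromstring(CAPABILITIES)
print(find_layers(root, "esip:dataset", "OMI_AI_G"))
```

Plain keywords such as "aerosol" are skipped rather than rejected, so tagged and untagged keywords can coexist in the same <KeywordList>.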
An initial attempt at a WMS that includes the current keyword encoding is available at [1]. In the actual use case, this encoding is used in a faceted search engine. Clearly, if the markup started out as a form of RDF, it would already be amenable to faceted search.
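As a rough illustration of how close the machine-tag form already is to RDF, the sketch below rewrites one keyword as an N-Triples statement. The subject URI and the namespace URI bound to the esip prefix are placeholders under example.org, chosen only for this example; no such vocabulary URI is defined in this document, and the function names are likewise hypothetical.

```python
# Placeholder prefix binding: the real ESIP vocabulary URI is not given here.
NAMESPACES = {"esip": "http://example.org/esip#"}


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def machine_tag_to_triple(subject_uri, keyword):
    """Turn 'prefix:predicate=value' into one N-Triples line, or None if it does not fit."""
    head, eq, value = keyword.strip().partition("=")
    prefix, colon, predicate = head.partition(":")
    if not (eq and colon and predicate and value) or prefix not in NAMESPACES:
        return None
    predicate_uri = NAMESPACES[prefix] + predicate
    return f'<{subject_uri}> <{predicate_uri}> "{escape_literal(value)}" .'


# The layer URI is a hypothetical identifier for the layer carrying the keyword.
print(machine_tag_to_triple("http://example.org/layer/OMI_AI_G", "esip:dataset=OMI_AI_G"))
```

The value is emitted as a plain literal here; whether values such as dataset names should instead be URIs from a controlled vocabulary is exactly the kind of choice on which a recommendation is sought.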
An alternative using XLink has also been proposed (XlinkMarkupExample).
RDFa was considered as well, but it is mostly defined in the context of XHTML.
Can the ESIP Semantic Web cluster provide a recommendation or suggestion on how to move forward that would be:
- flexible and extensible,
- compatible with the evolving ESIP datatype and services ontology, and
- lightweight and easy to use?
For any proposed solution, it would be extremely helpful to provide:
- an example of implementation, based on the current case at [2]
- an assessment of the scheme's usability by semantic web newbies
- pointers to existing tools that can work with the proposed solution, if they exist
- for bonus points, can the scheme be chained? That is, if I have
<Keyword>Platform:Satellite</Keyword>
can I also have something like
<Keyword>Satellite:Aura</Keyword>
- for even more bonus points, can the scheme be applied elsewhere, say, to extend the OpenSearch Atom response so that it returns structured metadata?
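The chaining question above can be made concrete with a small Python sketch. It assumes the key:value form shown in the two example keywords, and it assumes that each key has at most one value, which the use case does not guarantee. The function names are hypothetical.

```python
def parse_pair(keyword):
    """Split 'Key:Value' into a (key, value) tuple; return None if it does not fit."""
    key, colon, value = keyword.strip().partition(":")
    return (key, value) if colon and key and value else None


def follow_chain(keywords, start):
    """Follow key -> value links from start until no further link exists."""
    # Assumption: one value per key, so a plain dict is enough.
    links = dict(pair for pair in map(parse_pair, keywords) if pair is not None)
    path = [start]
    seen = {start}
    while path[-1] in links:
        nxt = links[path[-1]]
        if nxt in seen:  # guard against circular chains
            break
        path.append(nxt)
        seen.add(nxt)
    return path


print(follow_chain(["Platform:Satellite", "Satellite:Aura"], "Platform"))
# prints ['Platform', 'Satellite', 'Aura']
```

In this reading, the value of one keyword doubles as the key of the next, which works only while keys and values share a single naming space. A scheme with explicit namespaces would need a rule for when a value may be reused as a predicate.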
Resources
http://gcmd.nasa.gov/Resources/valids/archives/keyword_list.html
http://www.w3.org/2005/Incubator/ssn/wiki/Semantic_Mark_up -- Work in W3C SSN addressing this topic