Documenting Online Resources

From Earth Science Information Partners (ESIP)
Revision as of 15:18, September 7, 2014 by Ted.Habermann (talk | contribs)

As the World Wide Web has developed into a ubiquitous information source, links to online information and services have become critical elements in all metadata dialects. Some dialects emerged during the early days of the web, when less was known about how it would develop and flourish. URLs were simple and self-explanatory, so it was enough to include just the bare URL in the metadata. As URLs have increased in complexity, it has become more important to provide supporting information along with the links.

NASA GCMD Directory Interchange Format

URLs are described in DIF using the Related_URL field, which has the following properties:

<dif:Related_URL uuid="UUID">
    <dif:URL_Content_Type/>
    <dif:URL/>
    <dif:Description/>
</dif:Related_URL>

The URL_Content_Type field takes its values from the URL Content Type Keywords, which provide standard names for a number of data systems and services. Data access URLs can be recognized by a URL_Content_Type of "GET DATA". The Related_URL field is highly recommended and may be repeated.
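As a rough illustration of how the "GET DATA" convention can be used, the sketch below pulls the data access URLs out of a DIF record with Python's standard xml.etree.ElementTree module. The sample record, the example.org URLs, the namespace URI bound to the dif: prefix, and the function name data_access_urls are all assumptions made for this example. It also assumes the content type is carried in a dif:Type child of dif:URL_Content_Type, as the crosswalk on this page suggests.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the dif: prefix; check it against your own records.
NS = {"dif": "http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/"}

# Hypothetical record fragment, invented for illustration only.
SAMPLE_DIF = """
<dif:DIF xmlns:dif="http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/">
  <dif:Related_URL>
    <dif:URL_Content_Type><dif:Type>GET DATA</dif:Type></dif:URL_Content_Type>
    <dif:URL>https://example.org/data/granules</dif:URL>
    <dif:Description>Direct download of the data files</dif:Description>
  </dif:Related_URL>
  <dif:Related_URL>
    <dif:URL_Content_Type><dif:Type>VIEW PROJECT HOME PAGE</dif:Type></dif:URL_Content_Type>
    <dif:URL>https://example.org/project</dif:URL>
    <dif:Description>Project home page</dif:Description>
  </dif:Related_URL>
</dif:DIF>
"""


def data_access_urls(dif_xml):
    """Return the URL of every Related_URL whose content type is GET DATA."""
    root = ET.fromstring(dif_xml)
    urls = []
    # Related_URL may be repeated, so look at every occurrence.
    for related in root.findall("dif:Related_URL", NS):
        # Assumption: the type keyword sits in a dif:Type child element.
        content_type = related.findtext(
            "dif:URL_Content_Type/dif:Type", default="", namespaces=NS
        )
        if content_type.strip() == "GET DATA":
            url = related.findtext("dif:URL", default="", namespaces=NS)
            urls.append(url.strip())
    return urls


if __name__ == "__main__":
    for url in data_access_urls(SAMPLE_DIF):
        print(url)
```

Records that put the keyword directly in the text of dif:URL_Content_Type would need the lookup path adjusted accordingly.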

ECHO

The ECHO model includes several types of URLs, each with a unique set of properties:

<OnlineAccessURL>
    <URL/>
    <URLDescription/>
    <MimeType/>
</OnlineAccessURL>

<OnlineResource>
    <URL/>
    <Description/>
    <Type/>
    <MimeType/>
</OnlineResource>

<ProviderBrowseUrl>
    <URL/>
    <FileSize/>
    <Description/>
    <MimeType/>
</ProviderBrowseUrl>

ISO

The ISO standards use CI_OnlineResource objects to describe links. They include the following properties:

<gmd:CI_OnlineResource>
    <gmd:linkage/>
    <gmd:protocol/>
    <gmd:applicationProfile/>
    <gmd:name/>
    <gmd:description/>
    <gmd:function/>
</gmd:CI_OnlineResource>
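To show how these properties might be read in practice, here is a minimal Python sketch that collects the linkage, name, description, and protocol of every gmd:CI_OnlineResource in a document. The sample XML, the example.org URL, and the helper names are invented for illustration. The paths follow the ISO entries in the crosswalk on this page, except for the protocol path, which is assumed here to hold a gco:CharacterString.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical fragment, invented for illustration only.
SAMPLE_ISO = """
<gmd:CI_OnlineResource xmlns:gmd="http://www.isotc211.org/2005/gmd"
                       xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:linkage><gmd:URL>https://example.org/data/granules</gmd:URL></gmd:linkage>
  <gmd:name><gco:CharacterString>GET DATA</gco:CharacterString></gmd:name>
  <gmd:description>
    <gco:CharacterString>Direct download of the data files</gco:CharacterString>
  </gmd:description>
</gmd:CI_OnlineResource>
"""


def _text(element, path):
    """Return stripped text at path, or None if it is missing or empty."""
    value = element.findtext(path, namespaces=NS)
    return value.strip() if value else None


def describe_online_resources(iso_xml):
    """Return one dict per CI_OnlineResource found anywhere in the document."""
    root = ET.fromstring(iso_xml)
    tag = "{%s}CI_OnlineResource" % NS["gmd"]
    resources = []
    # iter() also visits the root, so a bare CI_OnlineResource fragment works.
    for res in root.iter(tag):
        resources.append({
            "url": _text(res, "gmd:linkage/gmd:URL"),
            "name": _text(res, "gmd:name/gco:CharacterString"),
            "description": _text(res, "gmd:description/gco:CharacterString"),
            # Assumed path; the crosswalk on this page does not list protocol.
            "protocol": _text(res, "gmd:protocol/gco:CharacterString"),
        })
    return resources


if __name__ == "__main__":
    for entry in describe_online_resources(SAMPLE_ISO):
        print(entry)
```

Optional properties that are absent come back as None, which makes it easy to spot links that carry only a bare URL.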

Connections

All of these approaches to describing online resources include properties that make links more self-explanatory and easier to use. There are some differences that need to be considered when comparing them:

  1. Types and Function Codes - All three dialects include a mechanism for classifying online resources. DIF and ISO use shared vocabularies and ECHO uses free text. The DIF vocabulary is hierarchical and includes roughly 35 choices. The ISO codeList includes 11 broad categories. The ECHO Collection metadata currently includes roughly 45 different values with some overlap with the DIF list. Given the variation in these existing approaches, it seems reasonable to map the DIF and ECHO types into the ISO name element, which is free text, rather than into the function element which is a codeList.
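The mapping suggested above could be sketched roughly as follows: a DIF or ECHO type is written into the free-text gmd:name element of a new gmd:CI_OnlineResource, and gmd:function is left out. The function name to_iso_online_resource, its parameters, and the example values are hypothetical. Joining type and subtype with " > " mirrors the notation used in the crosswalk on this page and is only one possible convention.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"


def _character_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</...></...> to parent."""
    wrapper = ET.SubElement(parent, "{%s}%s" % (GMD, tag))
    ET.SubElement(wrapper, "{%s}CharacterString" % GCO).text = text
    return wrapper


def to_iso_online_resource(url, description, type_name, subtype=None):
    """Build a gmd:CI_OnlineResource with the source type stored in gmd:name.

    Hypothetical helper: the type goes into the free-text name element and
    no gmd:function code is emitted.
    """
    resource = ET.Element("{%s}CI_OnlineResource" % GMD)
    linkage = ET.SubElement(resource, "{%s}linkage" % GMD)
    ET.SubElement(linkage, "{%s}URL" % GMD).text = url
    # Assumed convention: "Type > Subtype" when a subtype is present.
    name_text = "%s > %s" % (type_name, subtype) if subtype else type_name
    _character_string(resource, "name", name_text)
    _character_string(resource, "description", description)
    return resource


if __name__ == "__main__":
    ET.register_namespace("gmd", GMD)
    ET.register_namespace("gco", GCO)
    element = to_iso_online_resource(
        "https://example.org/data/granules",
        "Direct download of the data files",
        "GET DATA",
    )
    print(ET.tostring(element, encoding="unicode"))
```

Because gmd:name is free text, the original DIF or ECHO value survives the translation unchanged and can be recovered later.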


Crosswalks

Concept: URL
Description: Address of the online resource
Paths:
    DIF      /dif:DIF/dif:Related_URL/URL
    ECHO     echo:OnlineAccessURLs/echo:OnlineAccessURL/echo:URL
    ECHO     echo:OnlineResources/echo:OnlineResource/echo:URL
    EML      /eml:dataset/eml:distribution/eml:online/eml:url/eml:text
    FGDC     /fgdc:metadata/fgdc:idinfo/fgdc:citation/fgdc:citeinfo/fgdc:onlink | /fgdc:metadata/fgdc:idinfo/fgdc:browse/fgdc:browsen
    ISO      //gmd:CI_OnlineResource/gmd:linkage/gmd:URL
    OGC-SOS  /sos:Capabilities/ows:OperationsMetadata/ows:Operation/ows:DCP/ows:HTTP/ows:Get/@xlink:href
    OGC-SOS  /sos:Capabilities/ows:OperationsMetadata/ows:Operation/ows:DCP/ows:HTTP/ows:Post/@xlink:href

Concept: Online Resource Description
Description: A brief description of the online resource
Paths:
    DIF      /dif:DIF/dif:Related_URL/dif:Description
    ECHO     echo:OnlineAccessURLs/echo:OnlineAccessURL/echo:URLDescription
    ISO      //gmd:CI_OnlineResource/gmd:description/gco:CharacterString

Concept: Online Resource Function
Description: A description of the function of the online resource
Paths:
    ISO      //gmd:CI_OnlineResource/gmd:description/gco:CharacterString

Concept: Online Resource Name/Title
Description: A name or title of the online resource
Paths:
    DIF      /dif:DIF/dif:Related_URL/dif:URL_Content_Type/dif:Type > /dif:DIF/dif:Related_URL/dif:URL_Content_Type/dif:Subtype
    ISO      //gmd:CI_OnlineResource/gmd:name/gco:CharacterString

Concept: Format of the Online Resource
Description: Identify the format of the online resource
Paths:
    ECHO     echo:OnlineAccessURLs/echo:OnlineAccessURL/MimeType
    ISO      //gmd:CI_OnlineResource/gmd:applicationProfile/gco:CharacterString

XPath Note: The XPaths included in this table use several wildcards. // means any path, so //gmd:CI_ResponsibleParty indicates a gmd:CI_ResponsibleParty anywhere in an XML file. /*/ indicates a single level with several possible elements, which usually means one of several concrete realizations of an abstract object. For example, /*/gmd:identificationInfo could be gmd:MD_Metadata/gmd:identificationInfo or gmi:MI_Metadata/gmd:identificationInfo, and gmd:identificationInfo/*/gmd:descriptiveKeywords could be gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords or gmd:identificationInfo/srv:SV_ServiceIdentification/gmd:descriptiveKeywords.
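The two wildcards in this note can be tried out with a short Python sketch. The sample document and the function name wildcard_demo are invented for illustration. Note one difference: Python's xml.etree.ElementTree supports only a subset of XPath, and its "any depth" form is written with a leading dot (.//) relative to the element being searched.

```python
import xml.etree.ElementTree as ET

NS = {"gmd": "http://www.isotc211.org/2005/gmd"}

# Hypothetical document with both concrete identification types.
SAMPLE = """
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:srv="http://www.isotc211.org/2005/srv">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:descriptiveKeywords/>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
  <gmd:identificationInfo>
    <srv:SV_ServiceIdentification>
      <gmd:descriptiveKeywords/>
    </srv:SV_ServiceIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""


def wildcard_demo(xml_text):
    """Return (single-level matches, any-depth matches, concrete type names)."""
    root = ET.fromstring(xml_text)
    # /*/ : exactly one intermediate level, whatever the element is called.
    one_level = root.findall(
        "gmd:identificationInfo/*/gmd:descriptiveKeywords", NS
    )
    # // : any depth below the starting element (written .// in ElementTree).
    any_depth = root.findall(".//gmd:descriptiveKeywords", NS)
    # Local names of the elements that the * step matched, in document order.
    concrete = [
        el.tag.split("}", 1)[1]
        for el in root.findall("gmd:identificationInfo/*", NS)
    ]
    return one_level, any_depth, concrete


if __name__ == "__main__":
    one_level, any_depth, concrete = wildcard_demo(SAMPLE)
    print(len(one_level), len(any_depth), concrete)
```

Both queries find the keywords under either concrete identification element, which is why the crosswalk paths can stay short.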