Data Discovery (CSW)

From Earth Science Information Partners (ESIP)
Revision as of 15:49, November 7, 2014 by Ted.Habermann

Catalogue Services for the Web (CSW) is an Open Geospatial Consortium (OGC) standard for publishing and searching collections of descriptive information (metadata) for data, services, and related information objects. CSW defines Core Queryables, Core Returnables, and Other Queryables that can be supported in any compliant implementation. Profiles of CSW map these fields to concepts and xPaths for particular metadata dialects.

Catalog Services for the Web (CSW) Core Queryables

The Open Geospatial Consortium Catalog Services for the Web (CSW) standard defines eleven "Core Queryables" that must be supported in any compliant implementation. Profiles of CSW map these queryables to concepts and xPaths for particular metadata dialects.

Concept Description Dialect (Fit) Paths
Keyword A word or phrase that describes some aspect of a resource. Can be one of several types.

Note: The general identification keywords usually have a type of "theme" and are referred to as "theme keywords". Other types and vocabularies are used for other information. Service Entry Resource Format (SERF) requires a Science and a Service GCMD Keyword. This concept is called "Subject" in the CSW Specification.
ISO (1) /*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords[gmd:type/gmd:MD_KeywordTypeCode='theme']/gmd:keyword/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:descriptiveKeywords/mri:MD_Keywords[mri:type/mri:MD_KeywordTypeCode='theme']/mri:keyword/gco:CharacterString
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:CategoryKeyword
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:TopicKeyword
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:TermKeyword
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:VariableLevel1Keyword/echo:Value
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:VariableLevel2Keyword/echo:Value
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:VariableLevel3Keyword
ECHO (1) /*/echo:ScienceKeywords/echo:ScienceKeyword/echo:DetailedVariableKeyword
ECS (1) /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:DisciplineTopicParameters/ecs:DisciplineKeyword
ECS (1) /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:DisciplineTopicParameters/ecs:TopicKeyword
ECS (1) /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:DisciplineTopicParameters/ecs:TermKeyword
ECS (1) /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:DisciplineTopicParameters/ecs:VariableKeyword
DIF (1) /dif:DIF/dif:Parameters/dif:Category
DIF (1) /dif:DIF/dif:Parameters/dif:Topic
DIF (1) /dif:DIF/dif:Parameters/dif:Term
DIF (1) /dif:DIF/dif:Parameters/dif:Variable_Level_1
DIF (1) /dif:DIF/dif:Parameters/dif:Variable_Level_2
DIF (1) /dif:DIF/dif:Parameters/dif:Variable_Level_3
DIF (1) /dif:DIF/dif:Parameters/dif:Detailed_Variable
SERF /serf:SERF/serf:Keyword
Resource Title A short description of the resource. The title should be descriptive enough so that when a user is presented with a list of titles the general content of the data set can be determined. ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString
ISO-1 /*/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:title/gco:CharacterString
ECHO /*/echo:ShortName > /*/echo:LongName
ECS /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:ShortName > /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:LongName
DIF /dif:DIF/dif:Entry_Title
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Title
Abstract A paragraph describing the resource.

Note: This concept is called "Description" in Catalog Services for the Web.
ISO /*/gmd:identificationInfo/*/gmd:abstract/gco:CharacterString
ISO-1 /*/mdb:identificationInfo/*/mri:abstract/gco:CharacterString
ECHO /*/echo:Description
ECS /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:CollectionDescription
DIF /dif:DIF/dif:Summary/dif:Abstract
SERF /serf:SERF/serf:Summary/serf:Abstract
Any Text A target for full-text search of character data types in a catalogue ISO //text()
Resource Format The physical or digital manifestation of the resource ISO /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorFormat/gmd:MD_Format/gmd:name/gco:CharacterString
ISO /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributionFormat/gmd:MD_Format/gmd:name/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:resourceConstraints/mco:MD_LegalConstraints
ISO-1 /mdb:MD_Metadata/mdb:distributionInfo/mrd:MD_Distribution/mrd:distributionFormat/mrd:MD_Format
ISO-1 /mdb:MD_Metadata/mdb:distributionInfo/mrd:MD_Distribution/mrd:distributor/mrd:MD_Distributor/mrd:distributorFormat/mrd:MD_Format
ECHO /*/echo:DataFormat
DIF /dif:DIF/dif:Distribution/dif:Distribution_Format
SERF /serf:SERF/serf:Distribution/serf:Distribution_Format
Metadata Identifier A phrase or string which uniquely identifies the metadata file/record. ISO /*/gmd:fileIdentifier/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:metadataIdentifier/mcc:MD_Identifier
DIF /dif:DIF/dif:Entry_Id
SERF /serf:SERF/serf:Entry_ID
Modified Date Date on which the record was created or updated within the catalogue ISO /*/gmd:dateStamp/gco:Date
ISO /*/gmd:dateStamp/gco:DateTime
ISO-1 /mdb:MD_Metadata/mdb:dateInfo/cit:CI_Date[cit:dateType/cit:CI_DateTypeCode="lastUpdate"]/cit:date/gco:DateTime
DIF /dif:DIF/dif:DIF_Creation_Date
Type The nature or genre of the content of the resource. Type can include general categories, genres or aggregation levels of content. ISO /*/gmd:hierarchyLevel/gmd:MD_ScopeCode
ISO-1 /mdb:MD_Metadata/mdb:metadataScope/mdb:MD_MetadataScope/mdb:resourceScope/mcc:MD_ScopeCode
Bounding Box A bounding box for identifying a geographic area of interest

Note: This concept is called "Coverage" in the CSW Specification
ISO /*/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox
ISO /*/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:extent/gex:EX_Extent/gex:geographicElement/gex:EX_GeographicBoundingBox
ECHO /*/echo:Spatial/echo:HorizontalSpatialDomain/echo:Geometry/echo:BoundingRectangle
DIF /dif:DIF/dif:Spatial_Coverage
Coordinate Reference System (CRS) Geographic Coordinate Reference System (Authority and ID) for the BoundingBox ISO /*/gmd:referenceSystemInfo/gmd:MD_ReferenceSystem/gmd:referenceSystemIdentifier/gmd:RS_Identifier/gmd:code
ISO /*/gmd:referenceSystemInfo/gmd:MD_ReferenceSystem/gmd:referenceSystemIdentifier/gmd:RS_Identifier/gmd:codeSpace
ISO /*/gmd:referenceSystemInfo/gmd:MD_ReferenceSystem/gmd:referenceSystemIdentifier/gmd:RS_Identifier/gmd:version
Association Complete statement of a one-to-one relationship ISO /*/gmd:identificationInfo/*/gmd:aggregationInfo/gmd:MD_AggregateInformation/gmd:aggregateDataSetIdentifier/gmd:MD_Identifier/gmd:code/gco:CharacterString
ECS /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:CollectionAssociation
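
As a rough illustration of how one of these mappings can be applied, the sketch below evaluates the ISO Keyword path against a small record using Python's standard xml.etree.ElementTree. The sample record and the function name theme_keywords are hypothetical and exist only for this example. ElementTree implements a subset of XPath, so the [gmd:type/gmd:MD_KeywordTypeCode='theme'] predicate is applied by hand here; a full XPath engine could likely evaluate the path from the table directly.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the gmd/gco prefixes used by the ISO paths in the table.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical, heavily trimmed record used only for illustration.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:descriptiveKeywords>
        <gmd:MD_Keywords>
          <gmd:keyword><gco:CharacterString>SEA SURFACE TEMPERATURE</gco:CharacterString></gmd:keyword>
          <gmd:type><gmd:MD_KeywordTypeCode>theme</gmd:MD_KeywordTypeCode></gmd:type>
        </gmd:MD_Keywords>
      </gmd:descriptiveKeywords>
      <gmd:descriptiveKeywords>
        <gmd:MD_Keywords>
          <gmd:keyword><gco:CharacterString>PACIFIC OCEAN</gco:CharacterString></gmd:keyword>
          <gmd:type><gmd:MD_KeywordTypeCode>place</gmd:MD_KeywordTypeCode></gmd:type>
        </gmd:MD_Keywords>
      </gmd:descriptiveKeywords>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""


def theme_keywords(root):
    """Return keyword strings from MD_Keywords blocks whose type is 'theme'."""
    found = []
    # The table path starts with "/*/", i.e. "whatever the root element is",
    # so the remainder is evaluated relative to the root.
    blocks = root.findall(
        "gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords", NS
    )
    for block in blocks:
        # Hand-written equivalent of [gmd:type/gmd:MD_KeywordTypeCode='theme'].
        type_code = block.findtext(
            "gmd:type/gmd:MD_KeywordTypeCode", default="", namespaces=NS
        )
        if type_code.strip() == "theme":
            for node in block.findall("gmd:keyword/gco:CharacterString", NS):
                found.append((node.text or "").strip())
    return found


root = ET.fromstring(SAMPLE)
print(theme_keywords(root))  # ['SEA SURFACE TEMPERATURE']
```

The "place" keyword is skipped because only blocks typed as "theme" satisfy the predicate in the ISO path.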


Catalog Services for the Web (CSW) Core Returnable Properties

The Open Geospatial Consortium Catalog Services for the Web (CSW) standard defines "Core Returnable Properties" that must be supported in any compliant implementation. Profiles of CSW map these properties to concepts and xPaths for particular metadata dialects.

Concept Description Dialect (Fit) Paths
Resource Title A short description of the resource. The title should be descriptive enough so that when a user is presented with a list of titles the general content of the data set can be determined. ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString
ISO-1 /*/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:title/gco:CharacterString
ECHO /*/echo:ShortName > /*/echo:LongName
ECS /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:ShortName > /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:LongName
DIF /dif:DIF/dif:Entry_Title
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Title
Creator Creator of the resource ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode= 'author']
ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode= 'originator']
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Creator
Abstract A paragraph describing the resource.

Note: This concept is called "Description" in Catalog Services for the Web.
ISO /*/gmd:identificationInfo/*/gmd:abstract/gco:CharacterString
ISO-1 /*/mdb:identificationInfo/*/mri:abstract/gco:CharacterString
ECHO /*/echo:Description
ECS /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:CollectionDescription
DIF /dif:DIF/dif:Summary/dif:Abstract
SERF /serf:SERF/serf:Summary/serf:Abstract
Publisher Publisher of the cited resource ISO //gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode= 'publisher']
ISO-1 //cit:CI_Citation/cit:citedResponsibleParty/cit:CI_Responsibility[cit:role/cit:CI_RoleCode='publisher']/cit:party/cit:CI_Organisation/cit:name/gco:CharacterString
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Publisher
DIF /dif:DIF/dif:Reference/dif:Publisher
Contributor Name Contributor to the resource ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode= 'author']
ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode= 'originator']
Date The date of a creation or update event of the catalogue record. ISO /*/gmd:dateStamp/gco:Date
ISO /*/gmd:dateStamp/gco:DateTime
DIF /dif:DIF/dif:DIF_Creation_Date
Type The nature or genre of the content of the resource. Type can include general categories, genres or aggregation levels of content. ISO /*/gmd:hierarchyLevel/gmd:MD_ScopeCode
ISO-1 /mdb:MD_Metadata/mdb:metadataScope/mdb:MD_MetadataScope/mdb:resourceScope/mcc:MD_ScopeCode
Resource Format The physical or digital manifestation of the resource ISO /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorFormat/gmd:MD_Format/gmd:name/gco:CharacterString
ISO /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributionFormat/gmd:MD_Format/gmd:name/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:resourceConstraints/mco:MD_LegalConstraints
ISO-1 /mdb:MD_Metadata/mdb:distributionInfo/mrd:MD_Distribution/mrd:distributionFormat/mrd:MD_Format
ISO-1 /mdb:MD_Metadata/mdb:distributionInfo/mrd:MD_Distribution/mrd:distributor/mrd:MD_Distributor/mrd:distributorFormat/mrd:MD_Format
ECHO /*/echo:DataFormat
DIF /dif:DIF/dif:Distribution/dif:Distribution_Format
SERF /serf:SERF/serf:Distribution/serf:Distribution_Format
Metadata Identifier A phrase or string which uniquely identifies the metadata file/record. ISO /*/gmd:fileIdentifier/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:metadataIdentifier/mcc:MD_Identifier
DIF /dif:DIF/dif:Entry_Id
SERF /serf:SERF/serf:Entry_ID
Source A reference to a resource from which the present resource is derived. ISO /*/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:source/*
ISO-1 /mdb:MD_Metadata/mdb:resourceLineage/mrl:LI_Lineage/mrl:source/mrl:LI_Source
Metadata Language Language of the metadata ISO /*/gmd:language/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:defaultLocale/lan:PT_Locale/lan:language/lan:LanguageCode
Relation A reference to a related resource. ISO /*/gmd:identificationInfo/*/gmd:aggregationInfo/gmd:MD_AggregateInformation
Bounding Box A bounding box for identifying a geographic area of interest

Note: This concept is called "Coverage" in the CSW Specification
ISO /*/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox
ISO /*/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:extent/gex:EX_Extent/gex:geographicElement/gex:EX_GeographicBoundingBox
ECHO /*/echo:Spatial/echo:HorizontalSpatialDomain/echo:Geometry/echo:BoundingRectangle
DIF /dif:DIF/dif:Spatial_Coverage
Rights Information about rights held in and over the resource ISO /*/gmd:identificationInfo/*/gmd:resourceConstraints/gmd:MD_LegalConstraints
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:resourceConstraints/mco:MD_LegalConstraints
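
Several returnable concepts list more than one path for the same dialect (Date, for example, may be stored as gco:Date or gco:DateTime). One possible way to handle this, sketched below with Python's standard xml.etree.ElementTree, is to keep the candidate paths per concept in a lookup table and take the first one that yields text. The dictionary ISO_RETURNABLES, the function extract_returnables, and the sample record are assumptions made for this illustration and cover only a few of the concepts in the table.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Concept -> candidate ISO paths from the table. The leading "/*/" is dropped
# because each path is evaluated relative to the root element.
ISO_RETURNABLES = {
    "Resource Title": [
        "gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
    ],
    "Abstract": ["gmd:identificationInfo/*/gmd:abstract/gco:CharacterString"],
    "Metadata Identifier": ["gmd:fileIdentifier/gco:CharacterString"],
    "Date": ["gmd:dateStamp/gco:Date", "gmd:dateStamp/gco:DateTime"],
    "Metadata Language": ["gmd:language/gco:CharacterString"],
}

# Hypothetical record: the date is a gco:DateTime and no language is given.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>example-record-001</gco:CharacterString></gmd:fileIdentifier>
  <gmd:dateStamp><gco:DateTime>2014-11-07T15:49:00</gco:DateTime></gmd:dateStamp>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title><gco:CharacterString>Example Sea Surface Temperature Grid</gco:CharacterString></gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract><gco:CharacterString>Illustrative abstract text.</gco:CharacterString></gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""


def extract_returnables(root):
    """Build a flat {concept: text} record, skipping concepts with no match."""
    record = {}
    for concept, paths in ISO_RETURNABLES.items():
        for path in paths:
            text = root.findtext(path, namespaces=NS)
            if text and text.strip():
                record[concept] = text.strip()
                break  # first path that yields text wins
    return record


record = extract_returnables(ET.fromstring(SAMPLE))
for concept, value in record.items():
    print(f"{concept}: {value}")
```

Concepts with no matching element (Metadata Language in this sample) are simply left out of the result, which keeps the missing-value decision with the caller.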


Catalog Services for the Web (CSW) Additional Queryable Properties

Additional queryable properties defined in the ISO Profile of CSW (Table 10)

Concept Description Dialect (Fit) Paths
Revision Date Date of revision of the cited resource ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[gmd:dateType/gmd:CI_DateTypeCode='revision']/gmd:date/gco:Date
ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[gmd:dateType/gmd:CI_DateTypeCode='revision']/gmd:date/gco:DateTime
ISO-1 //cit:CI_Citation/cit:date/cit:CI_Date[cit:dateType/cit:CI_DateTypeCode='revision']/cit:date/gco:Date
ISO-1 //cit:CI_Citation/cit:date/cit:CI_Date[cit:dateType/cit:CI_DateTypeCode='revision']/cit:date/gco:DateTime
Publication Date Date of publication of the cited resource ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[gmd:dateType/gmd:CI_DateTypeCode='publication']/gmd:date/gco:Date
ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[gmd:dateType/gmd:CI_DateTypeCode='publication']/gmd:date/gco:DateTime
ISO-1 //cit:CI_Citation/cit:date/cit:CI_Date[cit:dateType/cit:CI_DateTypeCode='publication']/cit:date/gco:Date
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Release_Date
DIF /dif:DIF/dif:Reference/dif:Publication_Date
Organization Name Name of the organization ISO (1) //gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString
ISO-1 (1) //cit:CI_Responsibility/cit:party/cit:CI_Organisation/cit:name/gco:CharacterString
ECHO (1) /*/echo:Contacts/echo:Contact/echo:OrganizationName
ECS (1) /ecs:CollectionMetaDataFile/ecs:CollectionMetaDataSets/ecs:Collections/ecs:CollectionMetaData/ecs:Contact/ecs:ContactOrganizationName
DIF (1) /dif:DIF/dif:Data_Center/dif:Data_Center_Name/dif:Short_Name
DIF (1) /dif:DIF/dif:Data_Center/dif:Data_Center_Name/dif:Long_Name
Has Security Constraints Are there any security constraints? ISO /*/gmd:identificationInfo//*/gmd:resourceConstraints/gmd:MD_SecurityConstraints
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:resourceConstraints/mco:MD_SecurityConstraints
DIF /dif:DIF/dif:Access_Constraints
DIF /dif:DIF/dif:Use_Constraints
Metadata Language Language of the metadata ISO /*/gmd:language/gco:CharacterString
ISO-1 /mdb:MD_Metadata/mdb:defaultLocale/lan:PT_Locale/lan:language/lan:LanguageCode
Resource Identifier Identifier for the resource described by the metadata ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString
ISO-1 //cit:CI_Citation/cit:identifier/mcc:MD_Identifier/mcc:code
ECHO /*/echo:DataSetId
ECHO (1) /*/echo:ShortName | /*/echo:LongName
DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_DOI
DIF /dif:DIF/dif:Reference/dif:DOI
Parent Identifier A unique identifier for a parent dataset or collection ISO /*/gmd:parentIdentifier/gco:CharacterString
DIF /dif:DIF/dif:Parent_DIF
Keyword Type Methods used to group similar keywords ISO /*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode
ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/mri:MD_DataIdentification/mri:descriptiveKeywords/mri:MD_Keywords/mri:type/mri:MD_KeywordTypeCode
DIF /dif:DIF/dif:Related_URL/dif:URL_Content_Type/dif:Type > /dif:DIF/dif:Related_URL/dif:URL_Content_Type/dif:Subtype


xPath Note: The xPaths included in these tables use several wildcards. // means any path, so //gmd:CI_ResponsibleParty indicates a gmd:CI_ResponsibleParty anywhere in an XML file. /*/ indicates a single level with several possible elements. This usually indicates one of several concrete realizations of an abstract object. For example, /*/gmd:identificationInfo could be gmd:MD_Metadata/gmd:identificationInfo or gmi:MI_Metadata/gmd:identificationInfo, and gmd:identificationInfo/*/gmd:descriptiveKeywords could be gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords or gmd:identificationInfo/srv:SV_ServiceIdentification/gmd:descriptiveKeywords.

Fit: The fit of the dialect path with the concept is estimated on a scale of 1 = excellent two-way fit, 2 = one-way fit or some other problem, 3 = extension required.
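
The two wildcards can be tried out with a short sketch in Python's standard xml.etree.ElementTree. The template record and the function names below are hypothetical. ElementTree does not accept absolute paths on an element, so the leading /*/ is expressed by searching relative to the root element, and // is written as .// here.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gmi": "http://www.isotc211.org/2005/gmi",
}

# Hypothetical skeleton record; {root} is filled in with either root element name.
TEMPLATE = """<{root} xmlns:gmd="http://www.isotc211.org/2005/gmd"
        xmlns:gmi="http://www.isotc211.org/2005/gmi">
  <gmd:contact><gmd:CI_ResponsibleParty/></gmd:contact>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:citedResponsibleParty><gmd:CI_ResponsibleParty/></gmd:citedResponsibleParty>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</{root}>"""


def identification_sections(root):
    """/*/gmd:identificationInfo: direct children of the root, whatever the root is."""
    return root.findall("gmd:identificationInfo", NS)


def responsible_parties(root):
    """//gmd:CI_ResponsibleParty: matching elements at any depth below the root."""
    return root.findall(".//gmd:CI_ResponsibleParty", NS)


for root_name in ("gmd:MD_Metadata", "gmi:MI_Metadata"):
    root = ET.fromstring(TEMPLATE.format(root=root_name))
    print(
        root_name,
        len(identification_sections(root)),
        len(responsible_parties(root)),
    )
```

Both root elements give the same counts, since /*/ does not care which concrete root is present and // finds the responsible party under gmd:contact as well as the one nested inside the citation.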