Data Discovery (DataCite 4.0)
DataCite is an organization whose objective is to develop and support methods of locating, identifying and citing data and other research objects. It develops and supports the standards behind persistent identifiers for data and registers Digital Object Identifiers (DOIs) for research data.
In the terminology we use (described below), DataCite has created a set of recommendations at three levels (described in the schema description document) and an XML schema (a dialect) for implementing those recommendations. The dialect is currently used in the DataCite search portal and in creating DOI landing pages. The dialect can be useful to communities that are trying to improve the way they share metadata, and the recommendations are useful for communities looking for expert guidance about metadata elements that are applicable to data discovery. Our work explores how those recommendations can be useful for communities that already use other dialects.
The DataCite Metadata Schema is a list of metadata elements that DataCite considers necessary to maintain accurate and consistent identification of a resource for citation and retrieval purposes. The schema also supplies recommended use instructions. The resource being identified can be of any kind, but it is typically a dataset. Note: DataCite uses the term "dataset" in its broadest sense, including both numerical data and any other research data outputs. The recommendation has three parts (termed spirals): mandatory concepts, recommended concepts, and optional concepts. From the schema:
"There are three different levels of obligation for the metadata properties:
● Mandatory (M) properties must be provided,
● Recommended (R) properties are optional, but strongly recommended for interoperability and
● Optional (O) properties are optional and provide richer description.
Those clients who wish to enhance the prospects that their metadata will be found, cited and linked to original research are strongly encouraged to submit the Recommended as well as Mandatory set of properties. Together, the Mandatory and Recommended set of properties and their sub‐properties are especially valuable to information seekers and added‐service providers, such as indexers. The Metadata Working Group members strongly urge the inclusion of metadata identified as Recommended for the purpose of achieving greater exposure for the resource’s metadata record, and therefore, the underlying research itself."
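The Mandatory level can be read as a simple completeness check. The following is a minimal sketch, assuming a hypothetical flat record (a plain mapping of concept name to value, not a DataCite data structure); the concept names are the seven listed in the DataCite_4_Mandatory table, and the function name and sample values are made up for illustration.

```python
# Concept names taken from the DataCite_4_Mandatory table in this document.
MANDATORY_CONCEPTS = (
    "Resource Identifier",
    "Resource Identifier Type",
    "Author / Originator",
    "Resource Title",
    "Publisher",
    "Publication Date",
    "Resource Type",
)


def missing_mandatory(record: dict) -> list:
    """Return the mandatory concepts that are absent or blank in `record`.

    `record` is a hypothetical mapping of concept name -> value, used here
    only to illustrate the "must be provided" rule.
    """
    return [
        concept
        for concept in MANDATORY_CONCEPTS
        if not str(record.get(concept) or "").strip()
    ]


# Illustrative record with made-up values; three mandatory concepts are omitted.
example = {
    "Resource Identifier": "10.1234/example",
    "Resource Identifier Type": "DOI",
    "Author / Originator": "Example, A.",
    "Resource Title": "An example dataset",
}

print(missing_mandatory(example))
# → ['Publisher', 'Publication Date', 'Resource Type']
```

The same pattern could be extended to the Recommended and Optional levels once their concept lists are available; they are left out here because this section lists only the Mandatory concepts.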
DataCite_4_Mandatory
"Mandatory (M) properties must be provided."
Source: DataCite Metadata Schema 4.0
Concept | Description | Dialect (Fit) Paths |
---|---|---|
Resource Identifier | Identifier for the resource described by the metadata | ADIwg /adiwg:project/adiwg:idinfo/adiwg:ids/adiwg:projguid DCAT /dct:identifier DCITE /dcite:resource/dcite:identifier | /dcite:resource/dcite:alternateIdentifiers/dcite:alternateIdentifier DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_DOI DIF /dif:DIF/dif:Entry_ID/dif:Short_Name DIF-10 /dif:DIF/dif:Entry_ID/dif:Short_Name DIF-10 /dif:DIF/dif:Dataset_Citation[dif:Persistent_Identifier/dif:Type='DOI']/dif:Persistent_Identifier/dif:Identifier Dryad /*/dcterms:identifier ECHO /*/echo:DataSetId ECHO /*/echo:ShortName | /*/echo:LongName ECHO /*/echo:GranuleUR ECS /ecs:LocalGranuleID EML /eml:eml/@packageId HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Group[@Name='METADATA']/hdf5:Group[@Name='INVENTORYMETADATA']/hdf5:Group[@Name='ProductSpecificMetadata']/hdf5:Attribute[@Name='identifier_file_uuid']/hdf5:Data/hdf5:DataFromFile HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Attribute[@Name='identifier_file_uuid']/hdf5:Data/hdf5:DataFromFile ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:identifier/gmd:MD_Identifier/gmd:code//* ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:identifier/mcc:MD_Identifier/mcc:code//* MODS //mods:mods/mods:identifier Mercury /mercury:metadata/mercury:mercury/mercury:File_ID Mercury /mercury:metadata/mercury:Local-Control-Number Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:identifier RDA-CISL /rda:dsOverview/@ID RDA-CISL /rda:dsOverview/@ID RDA-CISL /rda:dsOverview/rda:doi THREDDS /thredds:catalog/thredds:dataset/@ID |
Resource Identifier Type | The type of identifier used to uniquely identify the resource. | DCITE /dcite:resource/dcite:identifier/@identifierType EML /eml:eml/@system MODS //mods:mods/mods:identifier/@type Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:identifier/@type |
Author / Originator | The principal author of the resource. Note: In CSW this concept is called Creator. | BDP /bdp:metadata/bdp:idinfo/bdp:citation/bdp:citeinfo/bdp:origin CSDGM /csdgm:metadata/csdgm:idinfo/csdgm:citation/csdgm:citeinfo/csdgm:origin DCITE /dcite:resource/dcite:creators/dcite:creator/dcite:affiliation DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Creator DIF-10 /dif:DIF/dif:Dataset_Citation/dif:Dataset_Creator Dryad /*/dcterms:creator ECHO /*/echo:Contacts/echo:Contact[Role='Data Originator'] ECHO /*/echo:Contacts/echo:Contact[Role='Producer'] ECHO /*/echo:Contacts/echo:Contact[Role='Investigator'] ECHO /*/echo:Contacts/echo:Contact[Role='investigator'] ECHO /*/echo:Contacts/echo:Contact[Role='INVESTIGATOR'] ECS /ecs:Author EML /eml:eml/*/creator HCLS dct:creator HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Attribute[@Name='creator_name']/hdf5:Data/hdf5:DataFromFile ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[normalize-space(gmd:role/gmd:CI_RoleCode)='author'] ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[normalize-space(gmd:role/gmd:CI_RoleCode)='originator'] ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[normalize-space(gmd:role/gmd:CI_RoleCode)='principalInvestigator'] ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:citedResponsibleParty/cit:CI_Responsibility[normalize-space(cit:role/cit:CI_RoleCode)='author'] ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:citedResponsibleParty/cit:CI_Responsibility[normalize-space(cit:role/cit:CI_RoleCode)='originator'] MODS //mods:mods/mods:name/mods:role[roleTerm='author'] MODS //mods:mods/mods:name/mods:role[roleTerm='creator'] MODS //mods:mods/mods:name/mods:role[roleTerm='originator'] Mercury /mercury:metadata/mercury:idinfo/mercury:citation/mercury:citeinfo/mercury:Principal_Investigator/mercury:Name Mercury /mercury:metadata/mercury:idinfo/mercury:citation/mercury:citeinfo/mercury:origin Mercury /mercury:metadata/mercury:mercury/mercury:Principal_Investigator/mercury:Name Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:creator RDA-CISL /rda:dsOverview/rda:author THREDDS //thredds:dataset/thredds:creator/thredds:name |
Resource Title | A short description of the resource. The title should be descriptive enough so that when a user is presented with a list of titles the general content of the data set can be determined. | ADIwg /adiwg:project/adiwg:idinfo/adiwg:citation/adiwg:citeinfo/adiwg:title BDP /bdp:metadata/bdp:idinfo/bdp:citation/bdp:citeinfo/bdp:title CSDGM /csdgm:metadata/csdgm:idinfo/csdgm:citation/csdgm:citeinfo/csdgm:title DCAT /dct:title DCITE /dcite:resource/dcite:titles/dcite:title DIF /dif:DIF/dif:Entry_Title DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Title DIF-10 /dif:DIF/dif:Entry_Title DIF-10 /dif:DIF/dif:Dataset_Citation/dif:Dataset_Title Dryad /*/dcterms:title ECHO /*/echo:ShortName | /*/echo:LongName ECHO /*/echo:DataSetId ECS /*/ecs:ShortName | /*/ecs:LongName EML /eml:eml/*/title HCLS dct:title HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Attribute[@Name='title']/hdf5:Data/hdf5:DataFromFile HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Group[@Name='METADATA']/hdf5:Group[@Name='COLLECTIONMETADATA']/hdf5:Attribute[@Name='LongName']/hdf5:Data/hdf5:DataFromFile ISO /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title//* ISO-1 /mdb:MD_Metadata/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:title//* MODS //mods:mods/mods:titleInfo/mods:title Mercury /mercury:metadata/mercury:idinfo/mercury:citation/mercury:citeinfo/mercury:title OGC-SOS /sos:Capabilities/ows:ServiceIdentification/ows:Title Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:title RDA-CISL /rda:dsOverview/rda:title SERF /serf:SERF/serf:Entry_Title THREDDS /thredds:catalog/thredds:dataset/@name THREDDS /thredds:catalog/thredds:dataset/thredds:metadata/dc:title THREDDS //thredds:dataset[1]/@name UMM /umm:UMM/umm:CollectionCitation/umm:Title |
Publisher | Publisher of the cited resource | BDP /bdp:metadata/bdp:idinfo/bdp:citation/bdp:citeinfo/bdp:pubinfo/bdp:publish CSDGM /csdgm:metadata/csdgm:idinfo/csdgm:citation/csdgm:citeinfo/csdgm:pubinfo/csdgm:publish DCAT /dct:publisher DCITE /dcite:resource/dcite:publisher DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Publisher DIF /dif:DIF/dif:Reference/dif:Publisher DIF-10 /dif:DIF/dif:Dataset_Citation/dif:Dataset_Publisher DIF-10 /dif:DIF/dif:Reference/dif:Publisher EML /eml:eml/*/publisher HCLS dct:publisher HDF5.1 /hdf5:HDF5-File/hdf5:RootGroup/hdf5:Attribute[@Name='publisher']/hdf5:Data/hdf5:DataFromFile ISO //gmd:CI_ResponsibleParty[normalize-space(gmd:role/gmd:CI_RoleCode)='publisher']/gmd:organisationName//* ISO-1 //cit:CI_Responsibility[normalize-space(cit:role/cit:CI_RoleCode)='publisher']/cit:party/cit:CI_Organisation/cit:name//* MODS //mods:mods/mods:originInfo/mods:publisher Mercury /mercury:metadata/mercury:idinfo/mercury:citation/mercury:citeinfo/mercury:pubinfo/mercury:publish Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:publisher RDA-CISL /rda:dsOverview/rda:creator THREDDS //thredds:dataset/thredds:publisher/thredds:name THREDDS //thredds:metadata/thredds:publisher/thredds:name |
Publication Date | Date of publication of the cited resource | BDP /bdp:metadata/bdp:idinfo/bdp:citation/bdp:citeinfo/bdp:pubdate CSDGM /csdgm:metadata/csdgm:idinfo/csdgm:citation/csdgm:citeinfo/csdgm:pubdate DCAT /dct:issued DCITE /dcite:resource/dcite:publicationYear DIF /dif:DIF/dif:Data_Set_Citation/dif:Dataset_Release_Date DIF /dif:DIF/dif:Reference/dif:Publication_Date DIF-10 /dif:DIF/dif:Dataset_Citation/dif:Dataset_Release_Date DIF-10 /dif:DIF/dif:Reference/dif:Publication_Date Dryad /*/dcterms:available ECHO /*/echo:InsertTime ECS /*ecs:CitationforExternalPublication EML /eml:eml/*/pubDate ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[normalize-space(gmd:dateType/gmd:CI_DateTypeCode)='publication']/gmd:date/gco:Date ISO //gmd:CI_Citation/gmd:date/gmd:CI_Date[normalize-space(gmd:dateType/gmd:CI_DateTypeCode)='publication']/gmd:date/gco:DateTime ISO-1 //cit:CI_Citation/cit:date/cit:CI_Date[normalize-space(cit:dateType/cit:CI_DateTypeCode)='publication']/cit:date/gco:DateTime MODS //mods:mods/mods:originInfo/mods:dateIssued Onedcx /onedcx:metadata/onedcx:dcTerms/dcterms:dateSubmitted RDA-CISL /rda:dsOverview/rda:publicationDate |
Resource Type | A code identifying the type of resource that the metadata describes, e.g. a dataset, a collection, or an application (see MD_ScopeCode). | BDP /bdp:metadata/bdp:distinfo/bdp:resdesc CSDGM /csdgm:metadata/csdgm:distinfo/csdgm:resdesc DCITE /dcite:resource/dcite:resourceType/@resourceTypeGeneral Dryad /*/dcterms:type EML /eml:eml/*/physical/dataFormat HCLS dct:Dataset | void:Dataset ISO /*/gmd:hierarchyLevel/gmd:MD_ScopeCode ISO-1 /mdb:MD_Metadata/mdb:metadataScope/mdb:MD_MetadataScope/mdb:resourceScope/mcc:MD_ScopeCode MODS //mods:mods/mods:typeOfResource Mercury /mercury:metadata/mercury:distinfo/mercury:resdesc Onedcx /onedcx:metadata/onedcx:simpleDc/dcterms:type RDA-CISL /rda:dsOverview/rda:contentMetadata/rda:dataType |
XPath Note: The XPaths in this table use two wildcards. // matches at any depth, so //gmd:CI_ResponsibleParty indicates a gmd:CI_ResponsibleParty anywhere in an XML file. /*/ matches exactly one level, which may be any of several possible elements; this usually indicates one of several concrete realizations of an abstract object. For example, /*/gmd:identificationInfo could be gmd:MD_Metadata/gmd:identificationInfo or gmi:MI_Metadata/gmd:identificationInfo, and gmd:identificationInfo/*/gmd:descriptiveKeywords could be gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords or gmd:identificationInfo/srv:SV_ServiceIdentification/gmd:descriptiveKeywords.
Fit: The fit of the dialect path with the concept is estimated on a scale of 1 = excellent two-way fit, 2 = one-way fit or some other problem, 3 = extension required.
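The wildcard behavior described in the XPath note can be tried out with Python's standard library. This is a minimal sketch, not part of the DataCite tooling: the two tiny records are made up and heavily simplified (they are not valid ISO metadata), and `make_record` and `count_matches` are hypothetical helper names. One assumption to note: `xml.etree.ElementTree` evaluates paths relative to the root element, so `//x` is written `.//x`, and the leading `/*/` is implied by starting the search from the root, whatever its name.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the ISO paths in the table.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gmi": "http://www.isotc211.org/2005/gmi",
    "srv": "http://www.isotc211.org/2005/srv",
}


def make_record(root_tag: str, ident_tag: str) -> ET.Element:
    """Build a made-up, simplified record with the given root and
    identification element names (the two 'concrete realizations')."""
    xml = f"""
    <{root_tag} xmlns:gmd="{NS['gmd']}" xmlns:gmi="{NS['gmi']}" xmlns:srv="{NS['srv']}">
      <gmd:identificationInfo>
        <{ident_tag}>
          <gmd:descriptiveKeywords>ocean</gmd:descriptiveKeywords>
          <gmd:pointOfContact><gmd:CI_ResponsibleParty/></gmd:pointOfContact>
        </{ident_tag}>
      </gmd:identificationInfo>
    </{root_tag}>
    """
    return ET.fromstring(xml.strip())


def count_matches(root: ET.Element) -> dict:
    """Count matches for the three wildcard patterns from the note."""
    return {
        # //gmd:CI_ResponsibleParty : the element at any depth
        "any_depth": len(root.findall(".//gmd:CI_ResponsibleParty", NS)),
        # /*/gmd:identificationInfo : child of the root, whatever the root is
        "under_any_root": len(root.findall("gmd:identificationInfo", NS)),
        # gmd:identificationInfo/*/gmd:descriptiveKeywords : one unnamed level
        "one_level": len(
            root.findall("gmd:identificationInfo/*/gmd:descriptiveKeywords", NS)
        ),
    }


data_record = make_record("gmd:MD_Metadata", "gmd:MD_DataIdentification")
service_record = make_record("gmi:MI_Metadata", "srv:SV_ServiceIdentification")

print(count_matches(data_record))
print(count_matches(service_record))
```

Both records produce the same counts even though their root and identification elements differ, which is the point of the wildcards: one path covers several concrete realizations. Paths in the table that use functions such as normalize-space() are beyond ElementTree's XPath subset and would need a fuller XPath engine.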