Attribute Convention for Data Discovery 1-2 Working

From Earth Science Information Partners (ESIP)
Revision as of 14:24, May 3, 2013 by Graybeal

The under-development version of the ACDD definitions is at Attribute_Convention_for_Data_Discovery_(ACDD)_Working.



Highly Recommended

Attribute Description THREDDS ISO 19115-2 OGC CSW Rubric Category
title
A short description of the dataset.
dataset@name
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString (M)
Title Text Search
summary
A paragraph describing the dataset.
metadata/documentation[@type="summary"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:abstract/gco:CharacterString (M)
Abstract Text Search
keywords
A comma-separated list of key words and phrases.
metadata/keyword
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString
Subject Text Search
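A minimal sketch of how a client might check the three highly recommended attributes listed above. It assumes the global attributes have already been read into a plain dictionary; the function names and the sample keyword values are hypothetical and are not part of the convention.

```python
# The three attribute names come from the "Highly Recommended" table above.
HIGHLY_RECOMMENDED = ("title", "summary", "keywords")


def missing_highly_recommended(global_attrs):
    """Return the highly recommended attributes that are absent or blank."""
    return [name for name in HIGHLY_RECOMMENDED
            if not str(global_attrs.get(name, "")).strip()]


def split_keywords(keywords):
    """Split the comma-separated 'keywords' attribute into a clean list."""
    return [word.strip() for word in keywords.split(",") if word.strip()]


# Hypothetical attribute values, for illustration only.
attrs = {"title": "crm_v1.grd",
         "keywords": "bathymetry, elevation ,coastal relief"}
print(missing_highly_recommended(attrs))   # ['summary']
print(split_keywords(attrs["keywords"]))   # ['bathymetry', 'elevation', 'coastal relief']
```

Blank values are treated as missing here; that is a design choice of this sketch, not a rule stated by the convention.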

Recommended

Attribute Description THREDDS ISO 19115-2 OGC CSW Rubric Category
id
The combination of the "naming authority" and the "id" should be a globally unique identifier for the dataset.
dataset@id
/gmi:MI_Metadata/gmd:fileIdentifier/gco:CharacterString (O)
Identifier Identifier
naming_authority
keywords_vocabulary
If you are following a guideline for the words/phrases in your "keywords" attribute, put the name of that guideline here.
metadata/keyword@vocabulary
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString
Text Search
cdm_data_type
The THREDDS data type appropriate for this dataset.
metadata/dataType
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:spatialRepresentationType/gmd:MD_SpatialRepresentationTypeCode
May need some extensions to this codelist. Current values: vector, grid, textTable, tin, stereoModel, video.
Other
history
Provides an audit trail for modifications to the original data.
metadata/documentation[@type="history"]
/gmi:MI_Metadata/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:statement/gco:CharacterString (O)
Text Search
comment
Miscellaneous information about the data.
metadata/documentation
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:supplementalInformation
Text Search
date_created
The date on which the data was created.
metadata/date[@type="created"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date (M)
/gmd:dateType/gmd:CI_DateTypeCode="creation"
Responsible Party
creator_name
The data creator's name, URL, and email. The "institution" attribute will be used if the "creator_name" attribute does not exist.
metadata/creator/name
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString
CI_RoleCode="originator" (O)
Responsible Party
creator_url
metadata/creator/contact@url
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
Responsible Party
creator_email
metadata/creator/contact@email
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString
Responsible Party
institution
metadata/creator/name
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString
Responsible Party
project
The scientific project that produced the data.
metadata/project
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString
and/or
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:aggregationInfo/gmd:MD_AggregateInformation/gmd:aggregateDataSetName/gmd:CI_Citation/gmd:title/gco:CharacterString
DS_AssociationTypeCode="largerWorkCitation" and DS_InitiativeTypeCode="project"
and/or

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString with gmd:MD_KeywordTypeCode="project"

Responsible Party
processing_level
A textual description of the processing (or quality control) level of the data.
metadata/documentation[@type="processing_level"]
acknowledgement
A place to acknowledge various types of support for the project that produced this data.
metadata/documentation[@type="funding"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:credit/gco:CharacterString
Responsible Party
geospatial_bounds
Describes the geospatial extent using any of the geometric objects (2D or 3D) supported by the Well-Known Text (WKT) format.
BoundingPolygon Extent
geospatial_lat_min
Describes a simple latitude/longitude bounding box. geospatial_lat_min specifies the southernmost latitude; geospatial_lat_max specifies the northernmost latitude; geospatial_lon_min specifies the westernmost longitude; geospatial_lon_max specifies the easternmost longitude of the bounding box.
The values of geospatial_lon_min and geospatial_lon_max reflect the actual longitude data values. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian or Prime Meridian), to geospatial_lon_min.
For a more detailed geospatial coverage, see the suggested geospatial attributes.
metadata/geospatialCoverage/northsouth/start
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:southBoundLatitude/gco:Decimal
BoundingBox Extent
geospatial_lat_max
metadata/geospatialCoverage/northsouth/size
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:northBoundLatitude/gco:Decimal
BoundingBox Extent
geospatial_lon_min
metadata/geospatialCoverage/eastwest/start
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:westBoundLongitude/gco:Decimal
BoundingBox Extent
geospatial_lon_max
metadata/geospatialCoverage/eastwest/size
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:eastBoundLongitude/gco:Decimal
BoundingBox Extent
geospatial_vertical_min
Describes a simple vertical bounding box. For a more detailed geospatial coverage, see the suggested geospatial attributes.
metadata/geospatialCoverage/updown/start
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:minimumValue/gco:Real
Extent
geospatial_vertical_max
metadata/geospatialCoverage/updown/size
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:maximumValue/gco:Real
Extent
time_coverage_start
Describes the temporal coverage of the data as a time range.
metadata/timeCoverage/start
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:beginPosition
Extent
time_coverage_end
metadata/timeCoverage/end
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:endPosition
Extent
time_coverage_duration
metadata/timeCoverage/duration
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:beginPosition
This element provides an ISO 8601-compliant description of the time period covered by the dataset. This standard supports descriptions of durations.
Extent
time_coverage_resolution
metadata/timeCoverage/resolution
Extent
standard_name_vocabulary
The name of the controlled vocabulary from which variable standard names are taken.
metadata/variables@vocabulary
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString
Text Search
license
Describes the restrictions to data access and distribution.
metadata/documentation[@type="rights"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:resourceConstraints/gmd:MD_LegalConstraints/gmd:useLimitation/gco:CharacterString
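The rule for geospatial_lon_min and geospatial_lon_max in the table above (a minimum greater than the maximum means the box wraps across the longitude discontinuity) can be sketched as a point-in-box test. This is an illustration only: the function name is hypothetical, and the wrapped example values 170 and -175 are made up for the demonstration.

```python
def lon_in_bounding_box(lon, lon_min, lon_max):
    """Return True if lon lies inside the box given by geospatial_lon_min/max.

    When lon_min > lon_max the box is taken to wrap across the longitude
    discontinuity, so it covers lon >= lon_min OR lon <= lon_max.
    """
    if lon_min <= lon_max:
        # Ordinary case: westernmost value is numerically smaller.
        return lon_min <= lon <= lon_max
    # Wrapped case: the box runs from lon_min up to the discontinuity
    # and continues from the discontinuity up to lon_max.
    return lon >= lon_min or lon <= lon_max


# Ordinary box, using the -80.0 / -64.0 values from the NcML example
# later on this page.
print(lon_in_bounding_box(-70.0, -80.0, -64.0))   # True
print(lon_in_bounding_box(-90.0, -80.0, -64.0))   # False

# Hypothetical wrapped box on a -180..180 axis: 170 eastward to -175.
print(lon_in_bounding_box(178.0, 170.0, -175.0))  # True
print(lon_in_bounding_box(0.0, 170.0, -175.0))    # False
```

The test assumes lon is expressed on the same longitude range (-180 to 180, or 0 to 360) as the two attribute values.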

Suggested

Attribute Description THREDDS ISO 19115-2 OGC CSW Rubric Category
contributor_name
The name and role of any individuals or institutions that contributed to the creation of this data.
metadata/contributor
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString
Responsible Party
contributor_role
metadata/contributor@role
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:role/gmd:CI_RoleCode
="principalInvestigator" | "author"
Responsible Party
publisher_name
The data publisher's name, URL, and email. The publisher may be an individual or an institution.
metadata/publisher/name
/gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString
and/or
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString
CI_RoleCode="publisher"
and/or
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString with gmd:MD_KeywordTypeCode="dataCenter"
Responsible Party
publisher_url
metadata/publisher/contact@url
/gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
and/or
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
CI_RoleCode="publisher"
Responsible Party
publisher_email
metadata/publisher/contact@email
/gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString
and/or
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString
CI_RoleCode="publisher"
Responsible Party
date_modified
The date on which this data was last modified.
metadata/date[@type="modified"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date
/gmd:dateType/gmd:CI_DateTypeCode="revision"
Modified Responsible Party
date_issued
The date on which this data was formally issued.
metadata/date[@type="issued"]
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date
/gmd:dateType/gmd:CI_DateTypeCode="publication"
Responsible Party
geospatial_lat_units
Further refinement of the geospatial bounding box can be provided by using these units and resolution attributes.
metadata/geospatialCoverage/northsouth/units
/gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure/@uom
Extent
geospatial_lat_resolution
metadata/geospatialCoverage/northsouth/resolution
/gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure
Extent
geospatial_lon_units
metadata/geospatialCoverage/eastwest/units
/gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure/@uom
Extent
geospatial_lon_resolution
metadata/geospatialCoverage/eastwest/resolution
/gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure
Extent
geospatial_vertical_units
metadata/geospatialCoverage/updown/units
/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:verticalCRS
Extent
geospatial_vertical_resolution
metadata/geospatialCoverage/updown/resolution
Extent
geospatial_vertical_positive
metadata/geospatialCoverage@zpositive
Extent

Highly Recommended Variable Attributes

Attribute Description THREDDS ISO 19115-2
long_name
A long descriptive name for the variable (not necessarily from a controlled vocabulary).
metadata/variables/variable@vocabulary_name
At present the ISO 19115-2 Standard supports only one name for a variable. Standard names can be provided as keywords with the appropriate thesaurus.
standard_name
A long descriptive name for the variable taken from a controlled vocabulary of variable names.
metadata/variables/variable@vocabulary_name
units
The units of the variable's data values. This attribute's value should be a valid udunits string.
metadata/variables/variable@units
/gmi:MI_Metadata/gmd:contentInfo/gmi:MI_CoverageDescription/gmd:dimension/gmd:MD_Band/gmd:units
coverage_content_type
An ISO 19115-1 code to indicate the source of the data. The valid values in the MD_CoverageContentTypeCode list are image, thematicClassification, physicalMeasurement, auxiliaryInformation, qualityInformation, referenceInformation, modelResult, and coordinate.
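As a rough illustration of the variable attributes above, the sketch below checks one variable's attributes for the four highly recommended names and validates coverage_content_type against the values listed in the table. It assumes the attributes are held in a plain dictionary; the function name is hypothetical, and the validity of the units string is not checked.

```python
# Values copied from the coverage_content_type row above.
VALID_COVERAGE_CONTENT_TYPES = frozenset({
    "image", "thematicClassification", "physicalMeasurement",
    "auxiliaryInformation", "qualityInformation", "referenceInformation",
    "modelResult", "coordinate",
})

# Attribute names from the "Highly Recommended Variable Attributes" table.
VARIABLE_ATTRIBUTES = ("long_name", "standard_name", "units",
                       "coverage_content_type")


def check_variable_attrs(var_attrs):
    """Return a list of human-readable problems for one variable."""
    problems = [f"missing {name}" for name in VARIABLE_ATTRIBUTES
                if name not in var_attrs]
    content_type = var_attrs.get("coverage_content_type")
    if (content_type is not None
            and content_type not in VALID_COVERAGE_CONTENT_TYPES):
        problems.append(f"invalid coverage_content_type: {content_type!r}")
    return problems


# Attributes of the "z" variable from the NcML example on this page.
z_attrs = {"long_name": "z", "units": "meters", "positive": "up"}
print(check_variable_attrs(z_attrs))
# ['missing standard_name', 'missing coverage_content_type']
```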




ISO Translation Notes

The translation between the Attribute Convention for Data Discovery (ACDD) and ISO is subject to a number of assumptions and conventions, described here.

People

The ACDD includes several types of people:

ACDD Attributes ISO Locations
creator_name, creator_email, creator_url, institution citation/citedResponsibleParty role=originator, point of contact, and metadata contact
contributor_name, contributor_role citation/citedResponsibleParty role=originator (may need adjustment)
publisher_name, publisher_email, publisher_url distributor and Data Center keyword
project Project keyword, aggregation information (initiative type = project)

Keywords

The ACDD includes several attributes that make sense as keywords in ISO:

ACDD Attributes ISO Locations
keywords theme keywords with thesaurus given by the keywords_vocabulary attribute
project Project keyword with unknown thesaurus and aggregation information (initiative type = project)
publisher_name Data Center keyword with unknown thesaurus
standard_names for parameters theme keywords with thesaurus = standard_name_vocabulary
publisher_name, publisher_email, publisher_url distributor and Data Center keyword
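The keyword mapping in the table above could be sketched as follows. The function name and the dictionary layout of each keyword group are hypothetical; only the attribute names and the theme, project, and dataCenter keyword types come from this page. An unknown thesaurus is represented as None.

```python
def iso_keyword_groups(attrs, standard_names=()):
    """Build ISO-style keyword groups from ACDD global attributes.

    Each group is a dict with 'type', 'thesaurus', and 'keywords' keys
    (a layout chosen for this sketch, not defined by ACDD or ISO).
    """
    groups = []
    if attrs.get("keywords"):
        words = [w.strip() for w in attrs["keywords"].split(",") if w.strip()]
        groups.append({"type": "theme",
                       "thesaurus": attrs.get("keywords_vocabulary"),
                       "keywords": words})
    if attrs.get("project"):
        groups.append({"type": "project", "thesaurus": None,
                       "keywords": [attrs["project"]]})
    if attrs.get("publisher_name"):
        groups.append({"type": "dataCenter", "thesaurus": None,
                       "keywords": [attrs["publisher_name"]]})
    if standard_names:
        groups.append({"type": "theme",
                       "thesaurus": attrs.get("standard_name_vocabulary"),
                       "keywords": list(standard_names)})
    return groups


# Hypothetical attribute values, for illustration only.
example = {"keywords": "bathymetry, elevation",
           "keywords_vocabulary": "Example Keyword List",
           "project": "Example Project"}
for group in iso_keyword_groups(example):
    print(group["type"], group["thesaurus"], group["keywords"])
```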

Translation Revisions

Several changes were introduced into Version 2.0.2 of the stylesheet for transforming NcML to ISO in order to improve the rubric score for the resulting ISO metadata. The changes included:

  1. Included netcdf/@location in the transform as a distribution onlineResource.
  2. Added tagname to writeResponsibleParty so that responsibleParties with different UML roles could be supported (i.e. contact vs. distributor).
  3. Added urlName and urlDescription to writeResponsibleParty to add content to the onlineResource.
  4. Moved publisher from citation to distributor and included publisher_name as a dataCenter keyword.
  5. Added project as a keyword with type=project.
  6. Added a distributionInfo section to the ISO output if publisher or location exist.

Determining an Order of Precedence

There can be conflicting information available from different sources within the THREDDS and CDM data models. The diagram below seeks to determine an order of precedence for what is recorded in the ncISO metadata when those attributes conflict.

[Figure: Metadataprecedence.png]

A key part of this discussion is the ability to identify potentially conflicting metadata between the different sources within THREDDS and NetCDF. Below we propose using groups to identify in the NcML which sources contain the relevant metadata that will be used in the ISO translation.

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2 ../XSD/ncml-2.2.xsd"
 xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
 location="http://localhost:8080/thredds/dodsC/test/crm_v1.nc">
    
    <!-- Metadata from the netCDF or NCML file global attributes -->    
    <attribute name="Conventions" value="CF-1.4" />
    <attribute name="title" value="crm_v1.grd" />
    <attribute name="history" value="xyz2grd -R-80/-64/40/48 -I3c -Gcrm_v1.grd" />
    <attribute name="GMT_version" value="4.5.1 [64-bit]" />
    <attribute name="creator_name" value="David Neufeld"/>
    <attribute name="creator_email" value="David.Neufeld@noaa.gov"/>    
    <attribute name="geospatial_lon_units" value="degrees_east" />
    <attribute name="geospatial_lat_units" value="degrees_north" />
    <attribute name="geospatial_lon_min" type="float" value="-80.0" />
    <attribute name="geospatial_lon_max" type="float" value="-64.0" />    
    <attribute name="geospatial_lat_max" type="float" value="48.0" />
    <attribute name="geospatial_lat_min" type="float" value="40.0" />
    <attribute name="geospatial_lon_resolution" type="double" value="8.33E-4" />
    <attribute name="geospatial_lat_resolution" type="double" value="8.33E-4" />
    
    <!-- Metadata calculated from the netCDF file axes based on CF conventions -->
    <group name="CFMetadata">
      <attribute name="geospatial_lon_min" value="-80.0" type="float" />
      <attribute name="geospatial_lat_min" value="40.0" type="float" />
      <attribute name="geospatial_lon_max" value="-64.0" type="float" />
      <attribute name="geospatial_lat_max" value="48.0" type="float" />
      <attribute name="geospatial_lon_units" value="degrees_east" />    
      <attribute name="geospatial_lat_units" value="degrees_north" />
      <attribute name="geospatial_lon_resolution" value="8.332899328159992E-4" />
      <attribute name="geospatial_lat_resolution" value="8.332465368190813E-4" />
    </group>
    
    <!-- Metadata from the THREDDS catalog dataset -->
    <group name="THREDDSMetadata">
        <attribute name="id" value="crm_v1" />
        <attribute name="creator_name" value="David Neufeld"/>
        <attribute name="creator_email" value="David.Neufeld@noaa.gov"/>    
        <attribute name="data_distribution" value="http://localhost:8080/thredds/dodsC/test/crm_v1.nc" />
        <attribute name="wms_service" value="http://localhost:8080//thredds/wms/crm/crm_vol9.nc" />
        <attribute name="wcs_service" value="http://localhost:8080//thredds/wcs/crm/crm_vol9.nc" />
    </group>    
        
    <!-- Metadata from the ncISO service -->
    <group name="NCISOMetadata">
      <attribute name="metadata_creation" value="2011-04-19" />
    </group>
    
    <dimension name="x" length="19201" />
    <dimension name="y" length="9601" />
    
    <variable name="z" shape="y x" type="float">
        <attribute name="long_name" value="z" />
        
        <attribute name="_FillValue" type="float" value="NaN" />
        <attribute name="actual_range" type="double" value="-2754.39990234375 1903.0" />
        <attribute name="units" value="meters" />
        <attribute name="positive" value="up" />
    </variable>
    <variable name="x" shape="x" type="double">
        <attribute name="long_name" value="x" />
        <attribute name="actual_range" type="double" value="-80.0 -64.0" />
        <attribute name="units" value="degrees_east" />
        
        <attribute name="_CoordinateAxisType" value="Lon" />
    </variable>
    <variable name="y" shape="y" type="double">
        <attribute name="long_name" value="y" />
        <attribute name="actual_range" type="double" value="40.0 48.0" />
        <attribute name="units" value="degrees_north" />
        <attribute name="_CoordinateAxisType" value="Lat" />
    </variable>
</netcdf>
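A rough sketch of how the grouped NcML above could be scanned for conflicting metadata, using Python's standard xml.etree.ElementTree. The helper names, the "global" label for top-level attributes, and the shortened SAMPLE document are assumptions made for illustration. The precedence list passed to resolve() is hypothetical as well: the actual order is the one defined by the precedence diagram, which is not reproduced in the code.

```python
import xml.etree.ElementTree as ET

NCML_NS = "{http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2}"

# Shortened version of the NcML example above.
SAMPLE = """
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <attribute name="creator_name" value="David Neufeld"/>
  <attribute name="geospatial_lon_resolution" type="double" value="8.33E-4"/>
  <group name="CFMetadata">
    <attribute name="geospatial_lon_resolution" value="8.332899328159992E-4"/>
  </group>
  <group name="THREDDSMetadata">
    <attribute name="creator_name" value="David Neufeld"/>
  </group>
</netcdf>
"""


def attributes_by_source(root):
    """Map attribute name -> {source label: value string}."""
    found = {}
    # Global attributes are direct children of <netcdf>.
    for attr in root.findall(f"{NCML_NS}attribute"):
        found.setdefault(attr.get("name"), {})["global"] = attr.get("value")
    # Each direct <group> child is treated as one metadata source.
    for group in root.findall(f"{NCML_NS}group"):
        for attr in group.findall(f"{NCML_NS}attribute"):
            found.setdefault(attr.get("name"), {})[group.get("name")] = attr.get("value")
    return found


def conflicts(found):
    """Keep only attributes whose sources disagree (string comparison)."""
    return {name: by_source for name, by_source in found.items()
            if len(set(by_source.values())) > 1}


def resolve(found, precedence):
    """Pick the value from the first source in `precedence` that has one."""
    resolved = {}
    for name, by_source in found.items():
        for source in precedence:
            if source in by_source:
                resolved[name] = by_source[source]
                break
    return resolved


found = attributes_by_source(ET.fromstring(SAMPLE))
print(conflicts(found))
# Hypothetical precedence order, for demonstration only.
print(resolve(found, ["CFMetadata", "global", "THREDDSMetadata"]))
```

Values are compared as strings, so "-80" and "-80.0" would be reported as a conflict; a fuller implementation would likely compare numeric attributes with a tolerance.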