Attribute Conventions for Data Discovery 2.0

From Earth Science Information Partners (ESIP)
Revision as of 14:43, September 2, 2014 by Amilan
       DRAFT - Not ready for use

Version and Status

This version is designated as Version 2.0. This page always holds the current version of the Attribute Convention for Data Discovery (ACDD); whenever the convention is updated, the version number at the top of the page is updated as well.

See the [category page] for information on the history of this convention.

Development

Any development version of the ACDD definitions is maintained at Attribute_Convention_for_Data_Discovery_(ACDD)_Working.

Overview

The NetCDF Group at Unidata has recommended attributes for data discovery. The Attribute Convention for Data Discovery (ACDD) addresses the need for such attributes by defining NetCDF global attributes that help data to be located efficiently.

Alignment with NetCDF and CF Conventions

The NetCDF User Guide (NUG) provides basic recommendations for creating NetCDF files; the NetCDF Climate and Forecast Metadata Conventions (CF) provides more specific guidance. The ACDD builds upon and is compatible with these conventions; it may refine the definition of some terms in those conventions, but does not preclude the use of any attributes defined by the NUG or CF.

The NUG does not require any global attributes, though it recommends and defines two, title and history; CF specifies many more. ACDD 1.2 adopts all CF 1.6 global attributes with the exception of 'institution'; we specify 'creator_institution' and 'publisher_institution', to provide more provenance information. We also modify the syntax of the 'Conventions' attribute; we adopt the NUG recommendation to supply all conventions in a single attribute. This change has been approved by the CF Conventions Committee and will be part of CF 1.7, which is not yet published.
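As an illustration of the single-attribute syntax for 'Conventions', the following minimal Python sketch joins convention names with blanks and switches to commas when any name contains a blank, as described in the Conventions entry under Global Attributes. The function name build_conventions and the exact ", " separator are assumptions made for this example; they are not defined by the NUG, CF, or ACDD.

```python
def build_conventions(names):
    """Join convention names into one Conventions attribute value.

    Names are blank-separated unless any name itself contains a blank,
    in which case commas are used instead.
    """
    cleaned = [name.strip() for name in names if name.strip()]
    if any(" " in name for name in cleaned):
        # Assumption: a comma followed by one space; only the comma is
        # called for by the convention text.
        return ", ".join(cleaned)
    return " ".join(cleaned)


# Global attributes held in a plain dict for illustration.
global_attrs = {"Conventions": build_conventions(["CF-1.6", "ACDD-1.2"])}
print(global_attrs["Conventions"])
```

A plain dict stands in for the file's global attributes here so the sketch does not depend on any particular NetCDF library.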

Attribute Crosswalks

Many of these attributes correspond to general discovery metadata content, so they are available in many metadata standards. This Unidata crosswalk to THREDDS page also includes a crosswalk to ISO 19115-2. Note that the attribute names link to the Unidata definitions. Many of these elements are included in the ISO 19115 Core specification. They are indicated in the tables below by an M, O, or C in parentheses. An “M” indicates that the element is mandatory. An “O” indicates that the element is optional. A “C” indicates that the element is mandatory under certain conditions.

Additional Metadata: metadata_link attribute

Other metadata dialects (e.g., ISO 19115) can provide information about collections and more details about the dataset. If additional metadata exists, you can make users aware of it by adding a global attribute named "metadata_link" to the netCDF file. The value of this attribute is a URL that gives the location of the more complete metadata.


Maintenance of Metadata

ACDD attributes, like all NetCDF attributes, characterize their containing (parent) granules. As NetCDF data are processed (e.g., through subsetting or other algorithms), these characteristics can be altered. The software or user doing the processing is responsible for updating these attributes as part of that processing, but some software processes and user practices leave them unchanged. This affects both producers and consumers of these files, who fall into three roles:

  • developers of software tools that process NetCDF files;
  • users who create new NetCDF files from existing ones; and
  • end users of NetCDF files.

NetCDF file creators (the first two roles) should ensure that the attributes of output files accurately represent those files, and specifically should not "pass through" any source attribute in unaltered form, unless it is known to remain accurate. NetCDF file users (all three roles) should verify critical attribute values, and understand how the source data and metadata were generated, to be confident the source metadata is current.
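One way to avoid passing stale attributes through is sketched below in Python: only attributes believed to remain accurate are copied, recomputed values are supplied explicitly, and a new line is appended to 'history'. The allowlist STILL_ACCURATE, the helper names, the timestamp format, and the example user and program names are all hypothetical choices for this illustration, not part of ACDD.

```python
from datetime import datetime, timezone

# Hypothetical allowlist: attributes this sketch assumes stay accurate
# after subsetting. A real tool would decide this per operation.
STILL_ACCURATE = {"title", "summary", "keywords", "license", "project"}


def history_line(user, program, arguments, when=None):
    """Format one history entry: date, time of day, user, program, arguments."""
    when = when or datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {user} {program} {arguments}"


def derive_attributes(source_attrs, recomputed, new_history):
    """Build global attributes for an output file derived from source_attrs."""
    # Copy only what is believed to remain accurate; drop everything else.
    out = {k: v for k, v in source_attrs.items() if k in STILL_ACCURATE}
    # Values recalculated from the output data replace any source values.
    out.update(recomputed)
    # Append to the existing history rather than overwriting it.
    previous = source_attrs.get("history", "")
    out["history"] = (previous + "\n" if previous else "") + new_history
    return out


source = {
    "title": "Example gridded product",
    "geospatial_lat_min": -90.0,
    "geospatial_lat_max": 90.0,
    "time_coverage_start": "2014-01-01T00:00:00Z",
    "history": "2014-09-01T00:00:00Z alice make_grid --all",
}
line = history_line("bob", "subset_tool", "--lat 0 30",
                    datetime(2014, 9, 2, 14, 43, tzinfo=timezone.utc))
result = derive_attributes(
    source, {"geospatial_lat_min": 0.0, "geospatial_lat_max": 30.0}, line)
print(result["history"])
```

In this example 'time_coverage_start' is dropped from the output because it was neither on the allowlist nor recomputed, which is the conservative behavior the paragraph above asks of file creators.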

The ACDD geospatiotemporal attributes present a special case, as this information is already fully defined by the CF coordinate variables (the redundant attributes are recommended to simplify access). Errors in these attributes will create an inconsistency between the metadata and data of the granule or file. The risk of these 'inconsistency errors' is highest for files that are aggregated into longer or larger products, or subset into shorter or smaller products, such as files from numerical forecast models and gridded satellite observations. For this reason, some providers of those data types may choose to omit the ACDD geospatiotemporal attributes from their files. If the ACDD geospatiotemporal attributes are present, checking them against the CF coordinate variables can serve as a partial test of the metadata's validity.
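The partial validity test mentioned above could look something like the following Python sketch, which compares the latitude/longitude bounding attributes with the extremes of the coordinate values. The function name check_bounds and the tolerance are assumptions, the coordinate values are passed as plain lists rather than read from a file, and the sketch assumes the data do not cross the longitude discontinuity.

```python
def check_bounds(attrs, lats, lons, tol=1e-6):
    """Return a list of mismatches between ACDD bounds and coordinate values."""
    expected = {
        "geospatial_lat_min": min(lats),
        "geospatial_lat_max": max(lats),
        # Assumption: no wrap-around, so min/max give west/east limits.
        "geospatial_lon_min": min(lons),
        "geospatial_lon_max": max(lons),
    }
    problems = []
    for name, value in expected.items():
        if name not in attrs:
            continue  # providers may omit these attributes; nothing to compare
        if abs(float(attrs[name]) - value) > tol:
            problems.append(
                f"{name}={attrs[name]} but coordinates give {value}")
    return problems


attrs = {
    "geospatial_lat_min": 10.0,
    "geospatial_lat_max": 25.0,  # stale value left over from a larger file
    "geospatial_lon_min": -5.0,
}
for problem in check_bounds(attrs, lats=[10.0, 15.0, 20.0], lons=[-5.0, 0.0, 5.0]):
    print(problem)
```

An empty result only shows that the bounds agree with the coordinates; it says nothing about the accuracy of the other attributes.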


Metadata Link

The netCDF metadata model is focused on providing "use metadata" for the data included in the file (or granule). Other metadata dialects (e.g., ISO 19115) can provide information about collections and more details about the dataset. To make users aware of that additional metadata, we recommend adding a global attribute named "Metadata_Link" to the netCDF file. The value of this attribute is a URL that gives the location of the more complete metadata. This element is not included in the current version of the NetCDF Attribute Convention for Dataset Discovery.

Global Attributes

Highly Recommended

Fields per entry: Description, THREDDS, ISO 19115-2, OGC CSW, Rubric Category (fields without a value are omitted).

title
  Description: A short phrase or sentence describing the dataset; this is a NetCDF Users Guide (NUG) attribute.
  THREDDS: dataset@name
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString (M)
  OGC CSW: Title
  Rubric Category: Text Search

summary
  Description: A paragraph describing the dataset, analogous to an abstract for a paper.
  THREDDS: metadata/documentation[@type="summary"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:abstract/gco:CharacterString (M)
  OGC CSW: Abstract
  Rubric Category: Text Search

keywords
  Description: A comma-separated list of key words and/or phrases. Keywords may be common words or phrases, terms from a controlled vocabulary (GCMD is often used), or URIs for terms from a controlled vocabulary (see also keywords_vocabulary attribute).
  THREDDS: metadata/keyword
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString
  OGC CSW: Subject
  Rubric Category: Text Search

Conventions
  Description: A list of the conventions followed by the dataset; blank space separated is recommended but commas should be used if any convention name contains blanks. For files that comply with this version of ACDD, include the term ACDD-1.2. This attribute is defined in NUG.

Recommended

Fields per entry: Description, THREDDS, ISO 19115-2, OGC CSW, Rubric Category (fields without a value are omitted).

id
  Description: An identifier for the data set, provided by and unique within its naming authority. The combination of the "naming authority" and the "id" should be globally unique, but the id can be globally unique by itself also. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include blanks.
  THREDDS: dataset@id
  ISO 19115-2: /gmi:MI_Metadata/gmd:fileIdentifier/gco:CharacterString (O)
  OGC CSW: Identifier
  Rubric Category: Identifier

naming_authority
  Description: The organization that provides the initial id (see above) for the dataset. The naming authority should be uniquely specified by this attribute.

keywords_vocabulary
  Description: If you are following a guideline for the words/phrases in your "keywords" attribute, put the name of that guideline here.
  THREDDS: metadata/keyword@vocabulary
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString
  Rubric Category: Text Search

cdm_data_type
  Description: The organization of the data, as derived from the Common Data Model's Scientific Data layer and understood by THREDDS (this is a THREDDS "dataType"). One of point, profile, section, station, station_profile, trajectory, grid, image, or swath. Please note that this is different from the CF NetCDF attribute 'featureType' that indicates a Discrete Sampling Geometry file - for guidance on those terms, please see this NODC guidance.
  THREDDS: metadata/dataType
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:spatialRepresentationType/gmd:MD_SpatialRepresentationTypeCode
    May need some extensions to this codelist. Current values: vector, grid, textTable, tin, stereoModel, video.
  Rubric Category: Other

history
  Description: Describes the processes/transformations used to create this data; can serve as an audit trail. This attribute is defined in the NUG: 'This is a character array with a line for each invocation of a program that has modified the dataset. Well-behaved generic netCDF applications should append a line containing: date, time of day, user name, program name and command arguments.' To include a more complete description you can append an ISO Lineage reference; see NOAA EDM ISO Lineage guidance.
  THREDDS: metadata/documentation[@type="history"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:statement/gco:CharacterString (O)
  Rubric Category: Text Search

source
  Description: The method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in CF.

comment
  Description: Miscellaneous information about the data, not captured elsewhere. This attribute is defined in CF.
  THREDDS: metadata/documentation
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:supplementalInformation
  Rubric Category: Text Search

date_created
  Description: The date on which the data was created.
  THREDDS: metadata/date[@type="created"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date (M) /gmd:dateType/gmd:CI_DateTypeCode="creation"
  Rubric Category: Responsible Party

creator_name
  Description: The data creator's name, URL, and email. The "institution" attribute will be used if the "creator_name" attribute does not exist.
  THREDDS: metadata/creator/name
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString CI_RoleCode="originator" (O)
  Rubric Category: Responsible Party

creator_url
  Description: See creator_name.
  THREDDS: metadata/creator/contact@url
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
  Rubric Category: Responsible Party

creator_email
  Description: See creator_name.
  THREDDS: metadata/creator/contact@email
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString
  Rubric Category: Responsible Party

institution
  Description: See creator_name.
  THREDDS: metadata/creator/name
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString
  Rubric Category: Responsible Party

project
  Description: The scientific project that produced the data.
  THREDDS: metadata/project
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:aggregationInfo/gmd:MD_AggregateInformation/gmd:aggregateDataSetName/gmd:CI_Citation/gmd:title/gco:CharacterString DS_AssociationTypeCode="largerWorkCitation" and DS_InitiativeTypeCode="project"
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString with gmd:MD_KeywordTypeCode="project"
  Rubric Category: Responsible Party

processing_level
  Description: A textual description of the processing (or quality control) level of the data.
  THREDDS: metadata/documentation[@type="processing_level"]

acknowledgement
  Description: A place to acknowledge various types of support for the project that produced this data.
  THREDDS: metadata/documentation[@type="funding"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:credit/gco:CharacterString
  Rubric Category: Responsible Party

geospatial_bounds
  Description: Describes geospatial extent using any of the geometric objects (2D or 3D) supported by the Well-Known Text (WKT) format.
  OGC CSW: BoundingPolygon
  Rubric Category: Extent

geospatial_lat_min
  Description: Describes a simple latitude/longitude bounding box. geospatial_lat_min specifies the southernmost latitude; geospatial_lat_max specifies the northernmost latitude; geospatial_lon_min specifies the westernmost longitude; geospatial_lon_max specifies the easternmost longitude of the bounding box. The values of geospatial_lon_min and geospatial_lon_max reflect the actual longitude data values. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian or Prime Meridian), to geospatial_lon_min. For a more detailed geospatial coverage, see the suggested geospatial attributes.
  THREDDS: metadata/geospatialCoverage/northsouth/start
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:southBoundLatitude/gco:Decimal
  OGC CSW: BoundingBox
  Rubric Category: Extent

geospatial_lat_max
  Description: See geospatial_lat_min.
  THREDDS: metadata/geospatialCoverage/northsouth/size
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:northBoundLatitude/gco:Decimal
  OGC CSW: BoundingBox
  Rubric Category: Extent

geospatial_lon_min
  Description: See geospatial_lat_min.
  THREDDS: metadata/geospatialCoverage/eastwest/start
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:westBoundLongitude/gco:Decimal
  OGC CSW: BoundingBox
  Rubric Category: Extent

geospatial_lon_max
  Description: See geospatial_lat_min.
  THREDDS: metadata/geospatialCoverage/eastwest/size
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:eastBoundLongitude/gco:Decimal
  OGC CSW: BoundingBox
  Rubric Category: Extent

geospatial_vertical_min
  Description: Describes a simple vertical bounding box. For a more detailed geospatial coverage, see the suggested geospatial attributes.
  THREDDS: metadata/geospatialCoverage/updown/start
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:minimumValue/gco:Real
  Rubric Category: Extent

geospatial_vertical_max
  Description: See geospatial_vertical_min.
  THREDDS: metadata/geospatialCoverage/updown/size
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:maximumValue/gco:Real
  Rubric Category: Extent

time_coverage_start
  Description: Describes the temporal coverage of the data as a time range.
  THREDDS: metadata/timeCoverage/start
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:beginPosition
  Rubric Category: Extent

time_coverage_end
  THREDDS: metadata/timeCoverage/end
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:endPosition
  Rubric Category: Extent

time_coverage_duration
  THREDDS: metadata/timeCoverage/duration
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:beginPosition provides an ISO8601 compliant description of the time period covered by the dataset. This standard supports descriptions of durations.
  Rubric Category: Extent

time_coverage_resolution
  THREDDS: metadata/timeCoverage/resolution
  Rubric Category: Extent

standard_name_vocabulary
  Description: The name of the controlled vocabulary from which variable standard names are taken.
  THREDDS: metadata/variables@vocabulary
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString
  Rubric Category: Text Search

license
  Description: Describe the restrictions to data access and distribution.
  THREDDS: metadata/documentation[@type="rights"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:resourceConstraints/gmd:MD_LegalConstraints/gmd:useLimitation/gco:CharacterString
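
The rule for geospatial_lon_min and geospatial_lon_max, including the case where geospatial_lon_min is greater than geospatial_lon_max, can be illustrated with a short Python sketch. The function name lon_in_bounds is an assumption for this example, and the sketch assumes the tested longitude uses the same range convention (for example -180 to 180, or 0 to 360) as the attribute values.

```python
def lon_in_bounds(lon, lon_min, lon_max):
    """True if lon lies in the box running eastward from lon_min to lon_max."""
    if lon_min <= lon_max:
        # Ordinary case: the box does not cross the discontinuity.
        return lon_min <= lon <= lon_max
    # lon_min > lon_max: the box crosses the longitude range discontinuity
    # (the antimeridian for -180..180 data, the Prime Meridian for 0..360).
    return lon >= lon_min or lon <= lon_max


# A box spanning the antimeridian in the -180..180 convention.
print(lon_in_bounds(175.0, 170.0, -170.0))
print(lon_in_bounds(0.0, 170.0, -170.0))
```

Treating the reversed ordering as a wrap-around, rather than as an error, matches the description of the bounding box attributes above.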

Suggested

Fields per entry: Description, THREDDS, ISO 19115-2, OGC CSW, Rubric Category (fields without a value are omitted).

contributor_name
  Description: The name and role of any individuals or institutions that contributed to the creation of this data.
  THREDDS: metadata/contributor
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString
  Rubric Category: Responsible Party

contributor_role
  Description: See contributor_name.
  THREDDS: metadata/contributor@role
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:role/gmd:CI_RoleCode="principalInvestigator" | "author"
  Rubric Category: Responsible Party

publisher_name
  Description: The data publisher's name, URL, and email. The publisher may be an individual or an institution.
  THREDDS: metadata/publisher/name
  ISO 19115-2: /gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString CI_RoleCode="publisher"
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword/gco:CharacterString with gmd:MD_KeywordTypeCode="dataCenter"
  Rubric Category: Responsible Party

publisher_url
  Description: See publisher_name.
  THREDDS: metadata/publisher/contact@url
  ISO 19115-2: /gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:onlineResource/gmd:CI_OnlineResource/gmd:linkage/gmd:URL CI_RoleCode="publisher"
  Rubric Category: Responsible Party

publisher_email
  Description: See publisher_name.
  THREDDS: metadata/publisher/contact@email
  ISO 19115-2: /gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString
    and/or
    /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString CI_RoleCode="publisher"
  Rubric Category: Responsible Party

date_modified
  Description: The date on which this data was last modified.
  THREDDS: metadata/date[@type="modified"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date /gmd:dateType/gmd:CI_DateTypeCode="revision"
  OGC CSW: Modified
  Rubric Category: Responsible Party

date_issued
  Description: The date on which this data was formally issued.
  THREDDS: metadata/date[@type="issued"]
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date /gmd:dateType/gmd:CI_DateTypeCode="publication"
  Rubric Category: Responsible Party

geospatial_lat_units
  Description: Further refinement of the geospatial bounding box can be provided by using these units and resolution attributes.
  THREDDS: metadata/geospatialCoverage/northsouth/units
  ISO 19115-2: /gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure/@uom
  Rubric Category: Extent

geospatial_lat_resolution
  Description: See geospatial_lat_units.
  THREDDS: metadata/geospatialCoverage/northsouth/resolution
  ISO 19115-2: /gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure
  Rubric Category: Extent

geospatial_lon_units
  Description: See geospatial_lat_units.
  THREDDS: metadata/geospatialCoverage/eastwest/units
  ISO 19115-2: /gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure/@uom
  Rubric Category: Extent

geospatial_lon_resolution
  Description: See geospatial_lat_units.
  THREDDS: metadata/geospatialCoverage/eastwest/resolution
  ISO 19115-2: /gmi:MI_Metadata/gmd:spatialRepresentationInfo/gmd:MD_Georectified/gmd:axisDimensionProperties/gmd:MD_Dimension/gmd:resolution/gco:Measure
  Rubric Category: Extent

geospatial_vertical_units
  Description: See geospatial_lat_units.
  THREDDS: metadata/geospatialCoverage/updown/units
  ISO 19115-2: /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:verticalElement/gmd:EX_VerticalExtent/gmd:verticalCRS
  Rubric Category: Extent

geospatial_vertical_resolution
  Description: See geospatial_lat_units.
  THREDDS: metadata/geospatialCoverage/updown/resolution
  Rubric Category: Extent

geospatial_vertical_positive
  THREDDS: metadata/geospatialCoverage@zpositive
  Rubric Category: Extent

Highly Recommended Variable Attributes

Fields per entry: Description, THREDDS, ISO 19115-2 (fields without a value are omitted).

long_name
  Description: A long descriptive name for the variable (not necessarily from a controlled vocabulary).
  THREDDS: metadata/variables/variable@vocabulary_name
  ISO 19115-2: At present the ISO 19115-2 Standard supports only one name for a variable. Standard names can be provided as keywords with the appropriate thesaurus.

standard_name
  Description: A long descriptive name for the variable taken from a controlled vocabulary of variable names.
  THREDDS: metadata/variables/variable@vocabulary_name

units
  Description: The units of the variable's data values. This attribute's value should be a valid udunits string.
  THREDDS: metadata/variables/variable@units
  ISO 19115-2: /gmi:MI_Metadata/gmd:contentInfo/gmi:MI_CoverageDescription/gmd:dimension/gmd:MD_Band/gmd:units

coverage_content_type
  Description: An ISO 19115-1 code to indicate the source of the data. The valid values in the MD_CoverageContentTypeCode list are image, thematicClassification, physicalMeasurement, auxiliaryInformation, qualityInformation, referenceInformation, modelResult, coordinate.
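
A simple check of a variable's attributes against this list might look like the Python sketch below. The function name check_variable_attrs and the message wording are assumptions for illustration; the attribute names and the valid coverage_content_type values are taken from the entries above. The sketch does not validate the units string against udunits.

```python
# Valid MD_CoverageContentTypeCode values, as listed above.
VALID_COVERAGE_CONTENT_TYPES = {
    "image", "thematicClassification", "physicalMeasurement",
    "auxiliaryInformation", "qualityInformation", "referenceInformation",
    "modelResult", "coordinate",
}

HIGHLY_RECOMMENDED = ("long_name", "standard_name", "units",
                      "coverage_content_type")


def check_variable_attrs(attrs):
    """Return a list of problems found in one variable's attributes."""
    problems = [f"missing {name}" for name in HIGHLY_RECOMMENDED
                if name not in attrs]
    content_type = attrs.get("coverage_content_type")
    if content_type is not None and content_type not in VALID_COVERAGE_CONTENT_TYPES:
        problems.append(f"invalid coverage_content_type: {content_type!r}")
    return problems


sst_attrs = {
    "long_name": "Sea Surface Temperature",
    "standard_name": "sea_surface_temperature",
    "units": "K",
    "coverage_content_type": "measurement",  # not in the code list
}
print(check_variable_attrs(sst_attrs))
```

Because these attributes are recommended rather than required, the "missing" messages are best read as advice, not as errors.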


Translation Revisions

Determining an Order of Precedence