Attribute Convention for Data Discovery 1-2 Working
Version and Status
This version is designated as Version 1.2.2.
This page is under development with updated definitions.
Introduction
This page consolidates ongoing work seeking to improve the definitions in the Attribute Convention for Data Discovery (ACDD).
The first 3 sections represent the terms in the corresponding sections of the ACDD.
Modifications relative to the original text may be seen with the history mechanism of this wiki. The original definitions are marked with the Summary keyword Original Definitions.
Process
Any community member who wishes to contribute may edit this page; where necessary, edits are discussed in greater depth on the Discussion page.
Once there is some consensus about a group of definitions, they will be migrated to the primary document and the version number of that document incremented.
Status
This summarizes the status of the terms as of 2014.02.03. All major issues have been resolved in the document, pending review by the ACDD team.
Details may be reviewed below the open issues in the Discussion page.
Suggested Changes to introductory words
The following (between § marks) is proposed to replace the text on the 'current version' page, from section Development to just before Highly Recommended.
§
Development
Any development version of the ACDD definitions is maintained at Attribute_Convention_for_Data_Discovery_(ACDD)_Working.
Overview
The NetCDF Group at Unidata has recommended attributes for data discovery. The Attribute Convention for Data Discovery (ACDD) addresses that need, providing definitions for NetCDF global attributes that will help data to be located efficiently.
Alignment with NetCDF and CF Conventions
The NetCDF Users Guide (NUG) provides basic recommendations for creating NetCDF files; the NetCDF Climate and Forecast Metadata Conventions (CF) provide more specific guidance. The ACDD builds upon and is compatible with these conventions; it may refine the definition of some terms in those conventions, but does not preclude the use of any attributes defined by the NUG or CF.
The NUG does not require any global attributes, though it recommends and defines two, title and history; CF specifies many more. ACDD adopts all CF global attributes, with the exception of institution; we specify creator_institution and publisher_institution to allow more information about the data to be included.
Attribute Crosswalks
Many of these attributes correspond to general discovery metadata content, so they are available in many metadata standards. This page includes the Unidata crosswalk to THREDDS and adds the crosswalk to ISO 19115-2. Note that the attribute names link to the Unidata definitions. Many of these elements are included in the ISO 19115 Core specification. They are indicated in this Table by an M, O, or C in parentheses. An “M” indicates that the element is mandatory. An “O” indicates that the element is optional. A “C” indicates that the element is mandatory under certain conditions.
Additional Metadata: metadata_link attribute
Other metadata dialects (e.g., ISO 19115) can provide information about collections and more details about the dataset. To make users aware of that additional metadata, we recommend adding a global attribute named "metadata_link" to the NetCDF file. The value of this attribute is a URL that gives the location of the more complete metadata. This element is not included in the current version of the NetCDF Attribute Convention for Dataset Discovery.
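As a minimal sketch of the recommendation above, the global attributes are modelled here as a plain Python dictionary rather than written with any particular NetCDF library. The helper name looks_like_url, the accepted URL schemes, and the example.org address are illustrative assumptions, not part of the convention.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Rough check that a metadata_link value is an absolute URL."""
    parts = urlparse(value)
    # Assumption: only these schemes are treated as usable links here.
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)


# Hypothetical global attributes for a file; the URL is a placeholder.
global_attrs = {
    "title": "Example dataset",
    "metadata_link": "https://example.org/metadata/example-dataset.xml",
}
```

A check of this kind could be run before the attributes are written to the file, so that a free-text value is caught early.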
Conformance Test
A Conformance Test is available for this convention.
Global Attributes
(reformat Highly Recommended, Recommended, etc. as 2nd-level headings)
§
Highly Recommended
- title
- A short phrase or sentence describing the dataset; this is a NetCDF Users Guide (NUG) attribute.
- summary
- A paragraph describing the dataset, analogous to an abstract for a paper.
- keywords
- A comma-separated list of key words and/or phrases. Keywords may be common words or phrases, terms from a controlled vocabulary (GCMD is often used), or URIs for terms from a controlled vocabulary (see also keywords_vocabulary attribute).
- Conventions
- A list of the conventions followed by the dataset; separating the names with blank spaces is recommended, but commas should be used if any convention name contains blanks. This attribute is defined in NUG.
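The two separator styles described for the Conventions attribute can be handled by a reader along the following lines. This is a sketch only: the function name split_conventions and the sample values are assumptions made for illustration.

```python
from typing import List


def split_conventions(value: str) -> List[str]:
    """Split a Conventions attribute value into individual convention names."""
    if "," in value:
        # A comma signals that names may contain blanks, so split on commas only.
        parts = value.split(",")
    else:
        # Otherwise the names are separated by blank space.
        parts = value.split()
    return [p.strip() for p in parts if p.strip()]
```

For example, split_conventions("CF-1.6 ACDD-1.2") and split_conventions("CF-1.6, ACDD-1.2") both yield the same two names, while a comma-separated value keeps a name such as "Some Convention 2.0" intact.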
Recommended
- id
- An identifier for the dataset, provided by and unique within its naming authority. The combination of the "naming_authority" and the "id" should be globally unique, though the id may also be globally unique by itself. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include blanks.
- naming_authority
- The organization that provides the initial id (see above) for the dataset. The naming authority should be uniquely specified by this attribute.
- cdm_data_type
- The organization of the data, as derived from the Common Data Model's Scientific Data layer and understood by THREDDS (this is a THREDDS "dataType"). One of point, profile, section, station, station_profile, trajectory, grid, image, or swath. Please note that this is different from the CF NetCDF attribute 'featureType' that indicates a Discrete Sampling Geometry file - for guidance on those terms, please see this NODC guidance.
- history
- Describes the processes/transformations used to create this data; can serve as an audit trail. Per the NUG: 'This is a character array with a line for each invocation of a program that has modified the dataset. Well-behaved generic netCDF applications should append a line containing: date, time of day, user name, program name and command arguments.' To include a more complete description you can append an ISO Lineage reference; see NOAA EDM ISO Lineage guidance. This attribute is defined in NUG.
- source
- The method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in CF.
- comment
- Miscellaneous information about the data, not captured elsewhere. This attribute is defined in CF.
- date_content_modified
- The date on which any of the provided content, including data, metadata, and presented format, was last changed (ISO 8601 format).
- date_values_modified
- The date on which the provided data values were last changed; excludes metadata and formatting changes (ISO 8601 format).
- creator
- The name of the person principally responsible for originating this data.
- creator_email
- The email address of the person principally responsible for originating this data.
- publisher
- The person responsible for the data file or product, with its current metadata and format.
- publisher_email
- The email address of the person responsible for the data file or product.
- processing_level
- A textual description of the processing (or quality control) level of the data.
- acknowledgement
- A place to acknowledge various types of support for the project that produced this data.
- geospatial_bounds
- Describes geospatial extent using any of the geometric objects (2D or 3D) supported by the Well-Known Text (WKT) format.
- geospatial_lat_min
- Describes a simple lower latitude limit; may be part of a bounding box or cube. Geospatial_lat_min specifies the southernmost latitude covered by the dataset.
- geospatial_lat_max
- Describes a simple upper latitude limit; may be part of a bounding box or cube. Geospatial_lat_max specifies the northernmost latitude covered by the dataset.
- geospatial_lon_min
- Describes a simple longitude limit; may be part of a bounding box or cube. Geospatial_lon_min specifies the westernmost longitude covered by the dataset. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for -180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min.
- geospatial_lon_max
- Describes a simple longitude limit; may be part of a bounding box or cube. Geospatial_lon_max specifies the easternmost longitude covered by the dataset. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for -180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min.
- geospatial_vertical_min
- Describes a numerically smaller vertical limit; may be part of a bounding box or cube. If geospatial_vertical_positive is up ('altitude' orientation), the geospatial_vertical_min attribute specifies the location closest to the earth's center covered by the dataset. If geospatial_vertical_positive is down ('depth' orientation), the geospatial_vertical_min attribute specifies the location furthest from the earth's center covered by the dataset.
- geospatial_vertical_max
- Describes a numerically larger vertical limit; may be part of a bounding box or cube. If geospatial_vertical_positive is up ('altitude' orientation), the geospatial_vertical_max attribute specifies the location furthest from the earth's center covered by the dataset. If geospatial_vertical_positive is down ('depth' orientation), the geospatial_vertical_max attribute specifies the location closest to the earth's center covered by the dataset.
- geospatial_vertical_positive
- One of 'up' or 'down'. If up, vertical values are interpreted as 'altitude', with negative values corresponding to below the reference datum (e.g., under water). If down, vertical values are interpreted as 'depth', with positive values corresponding to below the reference datum.
- time_coverage_start
- Describes the time of the first data point in the dataset. ISO 8601 format recommended.
- time_coverage_end
- Describes the time of the last data point in the dataset. ISO 8601 format recommended.
- time_coverage_duration
- Describes the duration of the dataset. ISO 8601 duration format recommended.
- time_coverage_resolution
- Describes the targeted time period between each value in the dataset. ISO 8601 duration format recommended.
- license
- Provide the URL to a standard or specific license, enter "Freely Distributed" or "None", or describe any restrictions to data access and distribution in free text.
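The geospatial longitude and vertical definitions above involve two interpretation rules that are easy to get wrong: a bounding box whose geospatial_lon_min exceeds its geospatial_lon_max, and the sign convention set by geospatial_vertical_positive. The sketch below shows one way a reader might apply them. The function names are hypothetical, and the wrap-around branch assumes that geospatial_lon_min remains the westernmost edge and geospatial_lon_max the easternmost edge, as the definitions state.

```python
def lon_in_bounds(lon, lon_min, lon_max):
    """Return True if lon lies inside the box given by geospatial_lon_min/max."""
    if lon_min <= lon_max:
        # Ordinary box: a single continuous interval.
        return lon_min <= lon <= lon_max
    # lon_min > lon_max: the box crosses the discontinuity meridian, so it
    # covers everything east of lon_min plus everything west of lon_max.
    return lon >= lon_min or lon <= lon_max


def altitude_range(vertical_min, vertical_max, positive):
    """Express geospatial_vertical_min/max as (lowest, highest) altitude."""
    if positive == "up":
        return (vertical_min, vertical_max)
    if positive == "down":
        # Depth values grow downward, so negate and swap the limits.
        return (-vertical_max, -vertical_min)
    raise ValueError("geospatial_vertical_positive must be 'up' or 'down'")
```

With -180:180 values, a box from 170 to -170 contains 175 and -175 but not 0; with 0:360 values, a box from 350 to 10 contains 5. A depth range of 0 to 500 with positive 'down' corresponds to altitudes from -500 to 0.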
Suggested
The following terms and definitions are offered in case they address your situation.
- contributor_info
- The name and role of any individuals, projects, or institutions that contributed to the creation of this data. May be presented as free text, or in a structured format compatible with conversion to ncML (e.g., insensitive to whitespace).
- date_product_generated
- The date on which this data file or product was produced/distributed (ISO 8601 format). While this date is like a file timestamp, the date_content_modified and date_values_modified should be used to assess the age of the contents of the file or product.
- geospatial_lat_units
- Units for the latitude axis. These are presumed to be "degree_north"; other options from udunits may be specified instead.
- geospatial_lat_resolution
- Information about the targeted spacing of points in latitude. (Format is not prescribed.)
- geospatial_lon_units
- Units for the longitude axis. These are presumed to be "degree_east"; other options from udunits may be specified instead.
- geospatial_lon_resolution
- Information about the targeted spacing of points in longitude. (Format is not prescribed.)
- geospatial_vertical_units
- Units for the vertical axis. These are presumed to be "meter" (of depth); other options from udunits may be specified. Note that the common oceanographic practice of using pressure for a vertical coordinate, while not strictly a depth, can be specified using the unit bar.
- geospatial_vertical_resolution
- Information about the targeted vertical spacing of points.
- creator_uri
- The unique identifier of the person principally responsible for originating this data.
- creator_institution
- The institution that originated this data; should uniquely identify the institution.
- creator_institution_info
- Additional free text information for the institution that originated this data.
- creator_project
- The scientific project that originated this data; should uniquely identify the project.
- creator_project_info
- Additional free text information for the project that originated this data.
- publisher_uri
- The unique identifier of the person responsible for providing the data file or product.
- publisher_institution
- The institution that provided the data file or equivalent product; should uniquely identify the institution.
- publisher_institution_info
- Additional information for the institution that provided the data file or equivalent product; can include any information as free text, or in a structured format compatible with conversion to ncML (e.g., insensitive to whitespace).
- publisher_project
- The scientific project that provided the data file or equivalent product; should uniquely identify the project.
- publisher_project_info
- Additional information for the project that provided the data file or equivalent product; can include any information as free text, or in a structured format compatible with conversion to ncML (e.g., insensitive to whitespace).
- keywords_vocabulary
- If you are using a controlled vocabulary for the words/phrases in your "keywords" attribute, this is the unique name or identifier of the vocabulary from which keywords are taken. If more than one keyword vocabulary is used, each may be presented with a prefix (e.g., "CF:NetCDF COARDS Climate and Forecast Standard Names") and a following comma, so that keywords may optionally be prefixed with the controlled vocabulary key.
- metadata_link
- A URI that gives the location of more complete metadata; a URL is recommended.
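The prefix scheme described for keywords_vocabulary can be read back roughly as follows. This is a sketch under assumptions: the function names and the sample attribute values are hypothetical, and the code assumes each vocabulary entry has the form "KEY:Name" with entries separated by commas.

```python
def parse_vocabularies(value):
    """Map each prefix in keywords_vocabulary to its vocabulary name."""
    vocab = {}
    for entry in value.split(","):
        key, sep, name = entry.strip().partition(":")
        if sep and key.strip():
            vocab[key.strip()] = name.strip()
    return vocab


def group_keywords(keywords, vocab):
    """Group comma-separated keywords by vocabulary prefix (None if unprefixed)."""
    grouped = {}
    for keyword in keywords.split(","):
        keyword = keyword.strip()
        if not keyword:
            continue
        key, sep, term = keyword.partition(":")
        if sep and key.strip() in vocab:
            grouped.setdefault(key.strip(), []).append(term.strip())
        else:
            # Unprefixed keywords, and URIs whose scheme is not a known
            # prefix, are kept whole under None.
            grouped.setdefault(None, []).append(keyword)
    return grouped
```

Because only prefixes declared in keywords_vocabulary are recognized, a keyword given as a URI (which also contains a colon) is left intact rather than split.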
Deprecated
The following terms and definitions are still in the specification, but are no longer recommended for use.
- Metadata_Convention
- (deprecated, supported for backward compatibility with current usage) Reference to the particular metadata convention(s) used for the described data file; recommended practice is to add the metadata convention(s) to the comma-delimited conventions list in the 'Conventions' attribute, per NetCDF Best Practices.
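The recommended practice of folding the deprecated attribute into 'Conventions' could be sketched as below, with the global attributes modelled as a plain dictionary. The function name is hypothetical, the sample values are illustrative, and two choices are assumptions of this sketch rather than requirements of the convention: the existing Conventions value is taken to be comma-delimited, and the legacy attribute is dropped from the returned copy.

```python
def merge_metadata_convention(attrs):
    """Return a copy of attrs with Metadata_Convention folded into Conventions."""
    merged = dict(attrs)
    # Assumption: the legacy attribute is removed once its names are merged.
    legacy = merged.pop("Metadata_Convention", "")
    current = merged.get("Conventions", "")
    names = [n.strip() for n in current.split(",") if n.strip()]
    for name in legacy.split(","):
        name = name.strip()
        if name and name not in names:
            names.append(name)
    if names:
        merged["Conventions"] = ", ".join(names)
    return merged
```

The input dictionary is left unmodified, so a caller that needs to keep the deprecated attribute for backward compatibility can still read it from the original.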
Additional Materials
Mappings ACDD to other metadata dialects
http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Mappings
Recommended Order of Precedence
http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Precedence