Attribute Convention for Data Discovery 1-2 Working

From Federation of Earth Science Information Partners
Revision as of 12:17, 31 January 2014 by NanGalbraith (Talk | contribs)

Version and Status

This version is designated as Version 1.2.1 beta.

This page is under development with updated definitions.

Introduction

This page consolidates ongoing work seeking to improve the definitions in the Attribute Convention for Data Discovery (ACDD).

The first 3 sections represent the terms in the corresponding sections of the ACDD.

Modifications relative to the original text may be seen with the history mechanism of this wiki. The original definitions are marked with the Summary keyword Original Definitions.

Process

The edits will be made in this page by anyone in the community who wishes to contribute, and discussed in greater depth in the Discussion page, if necessary. (The discussion page can also be used as an archive of changes on this page, if desired.)

Once there is some consensus about one or a group of definitions, they can be migrated to the primary document and the version number of that document incremented.

Status

This summarizes the status of the terms as of 2013.10.18.

It also references the open issues in the Discussion page.

These are the major remaining issues in the document.

Attributes Without Comment

Highly Recommended: title, summary

Recommended: id, naming_authority, comment, processing_level, acknowledgment, geospatial_* (bounds, lat_min, lat_max, lon_min, lon_max, vertical_min, vertical_max, vertical_positive), time_coverage_start, time_coverage_end, time_coverage_duration, license (wording reordered)

Suggested: geospatial_lat_units, geospatial_lon_units, geospatial_vertical_units, coverage_content_type

Attributes Discussed and Resolved

Recommended:

  • cdm_data_type: all issues resolved, needs one last read.
  • creator, creator_email, publisher, publisher_email: no issue with updates
  • time_coverage_resolution: updated to specify targeted spacing (and preferred format); needs review
  • standard_name_vocabulary: someone pointed out this is unnecessary; in CF the standard_name vocabulary is always CF. It's deleted.
  • contributor_info: principal objections (ISO 19139) are resolved; discussion may be needed, but I think satisfactory structural encodings may be found and should be acceptable.

Suggested:

  • geospatial_*_resolution (lat, lon, vertical): updated to specify targeted spacing; needs review

Attributes Under Discussion

Highly Recommended:

  • keywords: use type code or pseudo-groups syntax? ok to use URI in addition to selections from a vocabulary? ok to use prefix?

Recommended:

  • keywords_vocabulary: can multiple keyword vocabularies be separated by a comma and specified in keywords attribute with a prefix? (if not both, then do neither -- just use URI option in keywords)
  • history: had to drop ISO 19139 expression of lineage, replaced with external reference option
  • date_modified: recently discussed by Nan; description is updated per John's latest email in that thread
  • creator_url, publisher_url: moved to Suggested, changed to _uri, and specified to apply to person only

Suggested:

  • creator_project, creator_institution, publisher_project, publisher_institution: do they help discovery enough to include?
  • creator_project_info, creator_institution_info, publisher_project_info, publisher_institution_info: (deleted ISO 19139) do _they_ help discovery enough?
  • date_created: recently discussed by Nan; description is updated from John's latest email in that thread

Other:

  • Metadata_Conventions: changed text significantly per separate email thread; reference John's email titled Metadata_Conventions and Metadata_Link
  • Metadata_Link: defined

Working Definitions

Highly Recommended

title 
A short phrase or sentence describing the dataset; this is a NetCDF Users Guide (NUG) attribute.
summary 
A paragraph describing the dataset, analogous to an abstract for a paper.
keywords 
A comma-separated list of key words and/or phrases. Keywords may be common words or phrases, terms from a controlled vocabulary, or URIs for terms from a controlled vocabulary (see keywords_vocabulary below).
Conventions 
A list of the conventions followed by the dataset; blank space separated is recommended but commas should be used if any convention name contains blanks. This attribute is defined in NUG.
source 
The method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in the Climate and Forecast (CF) conventions.
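
The separator rule for the Conventions attribute can be sketched in a few lines. This is an illustration only, not part of the convention: the function name parse_conventions is hypothetical, and it assumes that any comma in the value means commas are the separators (so names may contain blanks), while a value without commas is split on blanks.

```python
def parse_conventions(value):
    """Split a Conventions attribute string into individual convention names.

    Assumption (not stated by ACDD): if the string contains a comma, commas
    are the separators and names may contain blanks; otherwise blanks are
    the separators.
    """
    if "," in value:
        parts = value.split(",")
    else:
        parts = value.split()
    # Drop surrounding whitespace and any empty entries.
    return [part.strip() for part in parts if part.strip()]
```

For instance, a value such as "CF-1.6 ACDD-1.2" (an illustrative string, not a prescribed one) would be read as two names, while a comma-delimited value keeps any blanks inside each name.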

Recommended

id 
An identifier for the data set, provided by and unique within its naming authority. The combination of the "naming authority" and the "id" should be globally unique, but the id can be globally unique by itself also. IDs can be URLs, URNs, DOIs, meaningful text strings, a local key, or any other unique string of characters. The id should not include blanks.
naming_authority 
The organization that provides the initial id (see above) for the dataset. The naming authority should be uniquely specified by this attribute.
keywords_vocabulary 
If you are using a controlled vocabulary for the words/phrases in your "keywords" attribute, this is the unique name or identifier of the vocabulary from which keywords are taken. If more than one keyword vocabulary is used, each may be presented with a prefix (e.g., "CF:NetCDF COARDS Climate and Forecast Standard Names") and a following comma, so that keywords may optionally be prefixed with the controlled vocabulary key.
cdm_data_type 
The organization of the data, as derived from the Common Data Model's Scientific Data layer and understood by THREDDS (this is a THREDDS "dataType"). One of point, profile, section, station, station_profile, trajectory, grid, image, or swath. Please note that this is different from the CF NetCDF attribute 'featureType' that indicates a Discrete Sampling Geometry file - for guidance on those terms, please see this NODC guidance.
history 
Describes the processes/transformations used to create this data; can serve as an audit trail. Per the NUG: 'This is a character array with a line for each invocation of a program that has modified the dataset. Well-behaved generic netCDF applications should append a line containing: date, time of day, user name, program name and command arguments.' To include a more complete description you can append an ISO Lineage reference; see NOAA EDM ISO Lineage guidance.
comment 
Miscellaneous information about the data, not captured elsewhere.
date_modified 
The date on which the provided content, including data, metadata, and presented format, was last changed.
creator  
The name of the person principally responsible for originating this data.
creator_email 
The email address of the person principally responsible for the data in the file.
publisher 
The person responsible for the data file, its metadata and format.
publisher_email 
The email address of the person responsible for the data file, its metadata and format.
processing_level 
A textual description of the processing (or quality control) level of the data.
acknowledgement 
A place to acknowledge various types of support for the project that produced this data.
geospatial_bounds 
Describes geospatial extent using any of the geometric objects (2D or 3D) supported by the Well-Known Text (WKT) format.
geospatial_lat_min 
Describes a simple lower latitude limit; may be part of a bounding box or cube. Geospatial_lat_min specifies the southernmost latitude covered by the dataset.
geospatial_lat_max 
Describes a simple upper latitude limit; may be part of a bounding box or cube. Geospatial_lat_max specifies the northernmost latitude covered by the dataset.
geospatial_lon_min 
Describes a simple longitude limit; may be part of a bounding box or cube. Geospatial_lon_min specifies the westernmost longitude covered by the dataset. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for -180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min.
geospatial_lon_max 
Describes a simple longitude limit; may be part of a bounding box or cube. Geospatial_lon_max specifies the easternmost longitude covered by the dataset. Cases where geospatial_lon_min is greater than geospatial_lon_max indicate the bounding box extends from geospatial_lon_max, through the longitude range discontinuity meridian (either the antimeridian for -180:180 values, or Prime Meridian for 0:360 values), to geospatial_lon_min.
geospatial_vertical_min 
Describes a numerically smaller vertical limit; may be part of a bounding box or cube. If geospatial_vertical_positive is up ('altitude' orientation), the geospatial_vertical_min attribute specifies the location closest to the earth's center covered by the dataset. If geospatial_vertical_positive is down ('depth' orientation), the geospatial_vertical_min attribute specifies the location furthest from the earth's center covered by the dataset.
geospatial_vertical_max 
Describes a numerically larger vertical limit; may be part of a bounding box or cube. If geospatial_vertical_positive is up ('altitude' orientation), the geospatial_vertical_max attribute specifies the location furthest from the earth's center covered by the dataset. If geospatial_vertical_positive is down ('depth' orientation), the geospatial_vertical_max attribute specifies the location closest to the earth's center covered by the dataset.
geospatial_vertical_positive 
One of 'up' or 'down'. If up, vertical values are interpreted as 'altitude', with negative values corresponding to below the reference datum (e.g., under water). If down, vertical values are interpreted as 'depth', with positive values corresponding to below the reference datum.
time_coverage_start 
Describes the time of the first data point in the data set. ISO8601 format recommended.
time_coverage_end 
Describes the time of the last data point in the data set. ISO8601 format recommended.
time_coverage_duration 
Describes the duration of the data set. ISO8601 duration format recommended.
time_coverage_resolution 
Describes the targeted time period between each value in the data set. ISO8601 duration format recommended.
license 
Provide the URL to a standard or specific license, enter "Freely Distributed" or "None", or describe any restrictions to data access and distribution in free text.
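
The geospatial bounds definitions above involve two pieces of reasoning that are easy to get wrong: a longitude box whose geospatial_lon_min is greater than its geospatial_lon_max, and the meaning of the vertical limits under each geospatial_vertical_positive setting. The following is a minimal sketch, not a reference implementation. The function names are hypothetical, and the longitude functions read the wrapped case as a box running eastward from geospatial_lon_min (westernmost) across the discontinuity meridian to geospatial_lon_max (easternmost), which is an interpretation of the definitions above.

```python
def lon_in_bounds(lon, lon_min, lon_max):
    """True if lon lies inside the box given by geospatial_lon_min/max.

    Assumes lon uses the same longitude range (-180:180 or 0:360)
    as the bounds themselves.
    """
    if lon_min <= lon_max:
        return lon_min <= lon <= lon_max
    # lon_min > lon_max: the box crosses the discontinuity meridian.
    return lon >= lon_min or lon <= lon_max


def lon_extent(lon_min, lon_max):
    """Width of the box in degrees of longitude."""
    width = lon_max - lon_min
    return width if width >= 0 else width + 360


def altitude_range(vertical_min, vertical_max, positive):
    """Return (lowest, highest) as altitudes relative to the reference datum.

    With positive 'down', the numerically larger depth is the lowest point,
    so the limits are negated and swapped.
    """
    if positive == "up":
        return (vertical_min, vertical_max)
    if positive == "down":
        return (-vertical_max, -vertical_min)
    raise ValueError("geospatial_vertical_positive must be 'up' or 'down'")
```

With geospatial_lon_min = 170 and geospatial_lon_max = -175, lon_extent gives 15 degrees, and a longitude of -178 falls inside the box while 0 does not.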

Suggested

contributor_info 
The name and role of any individuals, projects, or institutions that contributed to the creation of this data. May be presented as free text, or in a structured format compatible with conversion to ncML (e.g., insensitive to whitespace).
date_created 
The date on which this data product came into existence (for products that grow by adding data, this value isn't changed by later additions of data).
geospatial_lat_units 
Units for the latitude axis. These are presumed to be "degree_north"; other options from udunits may be specified instead.
geospatial_lat_resolution 
Information about the targeted spacing of points in latitude. (Format is not prescribed.)
geospatial_lon_units 
Units for the longitude axis. These are presumed to be "degree_east"; other options from udunits may be specified instead.
geospatial_lon_resolution 
Information about the targeted spacing of points in longitude. (Format is not prescribed.)
geospatial_vertical_units 
Units for the vertical axis. These are presumed to be "meter" (of depth); other options from udunits may be specified. Note that the common oceanographic practice of using pressure for a vertical coordinate, while not strictly a depth, can be specified using the unit bar.
geospatial_vertical_resolution 
Information about the targeted vertical spacing of points.
coverage_content_type 
Information about the content of the file. Valid values are image, thematicClassification, physicalMeasurement, auxiliaryInformation, qualityInformation, referenceInformation, modelResult, and coordinate.
creator_uri 
The unique identifier of the person principally responsible for the data.
creator_institution 
The institution that produced the data; should uniquely identify the institution.
creator_institution_info 
Additional free text information for the institution that produced the data.
creator_project 
The scientific project that produced the data; should uniquely identify the project.
creator_project_info 
Additional free text information for the project that produced the data.
publisher_uri 
The unique identifier of the person responsible for the data file, its metadata and format.
publisher_institution 
The institution that published the data file; should uniquely identify the institution.
publisher_institution_info 
Additional information for the institution that published the data; can include any information as ISO 19139 or free text.
publisher_project 
The scientific project that published the data; should uniquely identify the project.
publisher_project_info 
Additional information for the project that published the data; can include any information as ISO 19139 or free text.
metadata_link 
A URI that gives the location of more complete metadata; a URL is recommended.
Metadata_Conventions 
(deprecated, supported for backward compatibility with current usage) Reference to the particular metadata convention(s) used for the described data file; recommended practice is to add the metadata convention(s) to the comma-delimited conventions list in the 'Conventions' attribute, per NetCDF Best Practices.
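
One way a data provider might use the tiers above is a simple completeness check over a file's global attributes. The sketch below is only an illustration: the function name missing_attributes and the plain-dictionary representation of global attributes are assumptions, and the default list is the Highly Recommended set as defined on this page. Callers can pass the Recommended or Suggested names instead.

```python
# Highly Recommended attribute names, as listed in the working definitions.
HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions", "source")


def missing_attributes(global_attrs, names=HIGHLY_RECOMMENDED):
    """Return the names that are absent or blank in global_attrs.

    global_attrs is assumed to be a dict mapping attribute name to value.
    """
    return [
        name
        for name in names
        if not str(global_attrs.get(name, "")).strip()
    ]
```

A blank or whitespace-only value is treated as missing here; that is a design choice of the sketch, not something the convention states.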

Note: The NUG defines title and history to be global attributes. CF adds institution, source, references, and comment, to be either global or assigned to individual variables. When an attribute appears both globally and as a variable attribute, the variable's version has precedence. ACDD does not require or define institution or references.
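
The precedence rule in the note above can be expressed as a small lookup. This is a sketch under stated assumptions: the function name effective_attribute is hypothetical, and both attribute sets are represented as plain dictionaries rather than through any particular netCDF library.

```python
def effective_attribute(name, global_attrs, variable_attrs):
    """Resolve an attribute for one variable.

    The variable's own value takes precedence; otherwise the global
    value is used. Returns None if the attribute is defined nowhere.
    """
    if name in variable_attrs:
        return variable_attrs[name]
    return global_attrs.get(name)
```

For example, if comment is set both globally and on a variable, the variable's comment is the one returned for that variable.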


Mappings ACDD to other metadata dialects

http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Mappings

Recommended Order of Precedence

http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Precedence

Future Directions: Object Conventions for Data Discovery

http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Object_Conventions

ISO Translation Notes

http://wiki.esipfed.org/index.php?title=Attribute_Convention_for_Data_Discovery_(ACDD)_ISO_TranslationNotes
