Documentation Terminology

From Federation of Earth Science Information Partners

Established metadata terminology is the result of a multi-decade, cooperative effort between metadata experts in NOAA, NASA and other U.S. Federal Agencies. The selection below is intended to provide a framework of basic terminology in order to facilitate understanding of fundamental concepts. More in-depth information and application examples are available in Habermann, 2014.

Terms

Collection A group of metadata records commonly organized by a data facility, organization or project and often stored in a database or web accessible folder.
Concept General term for describing a documentation entity. Concepts are independent of dialect so a single concept can occur in many dialects. They are typically represented (in XML) by an element or a collection of elements.
Dialect A particular representation of metadata that is specific to a community. Examples include the Content Standard for Digital Geospatial Metadata (CSDGM), the Directory Interchange Format (DIF), and ISO 19115-3 (the XML representation of the ISO 19115-1 conceptual model).
Dialect Maximum The number of concepts from a particular recommendation that are included in a particular dialect and, therefore, the maximum number of concepts from that recommendation that can be represented in the dialect. Note: the dialect maximum is always less than or equal to the number of concepts included in the recommendation (recommendation maximum).
Documentation The complete collection of unstructured written, drawn, presented or recorded materials necessary for discovering, accessing, understanding, and reproducing scientific data and results.
Element An item providing a value for a concept, typically in an XML representation. Elements depend on dialects. They are the instantiation of a concept in a dialect.
Level Recommendations may have different degrees of necessity associated with a concept's occurrence in a record, e.g. mandatory, recommended, and suggested. These subsets of concepts within a recommendation are called levels.
Metadata Structured and standardized elements of scientific documentation.
Recommendation A set of concepts that an organization identifies for achieving a documentation goal.
Recommendation Maximum The number of concepts included in a recommendation. Note that the recommendation maximum is the maximum completeness score available for a metadata record being evaluated with respect to that recommendation. The recommendation maxima are always greater than or equal to all dialect maxima for that recommendation.
Signature A series of numbers that give the number of concepts/elements missing from a metadata record (or a group of metadata records) in a series of spirals. Signatures with low numbers indicate fewer missing elements and a signature made up completely of 0's indicates a record or group of records that is complete with respect to a particular recommendation/dialect combination. A signature of 2 3 indicates that 2 elements are missing from the first spiral and 3 are missing from the second. The sum of the numbers in a signature is the total number of elements missing from a record or group of records.
Spiral A set of concepts required to support a particular documentation need or use case.
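
The relationship between recommendation maximum, dialect maximum, and signature can be illustrated with a minimal Python sketch. The spiral names, concept names, and function names below are hypothetical and are not taken from any ESIP tool. The sketch also assumes that a signature counts only concepts the dialect can represent, which is one reading of "complete with respect to a particular recommendation/dialect combination".

```python
# Hypothetical recommendation: concepts grouped into spirals (illustrative names).
recommendation = {
    "Discovery": ["Title", "Abstract", "Spatial Extent", "Temporal Extent"],
    "Access": ["Distribution URL", "Data Format", "Access Constraints"],
}

# Hypothetical set of concepts that one dialect is able to represent.
dialect_concepts = {
    "Title", "Abstract", "Spatial Extent", "Temporal Extent",
    "Distribution URL", "Data Format",
}

# Concepts for which a hypothetical metadata record provides a value.
record_concepts = {"Title", "Spatial Extent", "Distribution URL"}


def recommendation_maximum(rec):
    """Number of concepts included in the recommendation."""
    return sum(len(concepts) for concepts in rec.values())


def dialect_maximum(rec, dialect):
    """Number of the recommendation's concepts that the dialect includes."""
    return sum(1 for concepts in rec.values() for c in concepts if c in dialect)


def signature(rec, dialect, record):
    """Missing-concept count per spiral, in spiral order.

    Assumption: concepts the dialect cannot represent are not counted as missing.
    """
    return [
        sum(1 for c in concepts if c in dialect and c not in record)
        for concepts in rec.values()
    ]


rec_max = recommendation_maximum(recommendation)
dia_max = dialect_maximum(recommendation, dialect_concepts)
sig = signature(recommendation, dialect_concepts, record_concepts)

print("recommendation maximum:", rec_max)
print("dialect maximum:", dia_max)
print("signature:", " ".join(str(n) for n in sig))
print("total missing:", sum(sig))
```

In this example the dialect maximum is smaller than the recommendation maximum because the hypothetical dialect has no place for "Access Constraints", and the sum of the signature is the total number of elements the record is missing.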

Notes

Concepts

Concepts can be described at a general level or broken down into more detailed concepts. For example, “Spatial Extent” is a high-level metadata concept that can be addressed in a general manner, or it can include more detailed concepts such as a bounding latitude/longitude box or geographic identifiers.

Documentation

  • Examples include but are not limited to: notebooks, scientific papers, web pages, user guides, word processing documents, spreadsheets, data dictionaries, PDFs, custom binary and ASCII formats, and many others, each with associated storage and preservation strategies.

More often than not, the scientific process is documented, stored, and circulated using different tools and approaches depending on the needs of an exclusive group within the scientific community. This customized, often unstructured approach may work well for independent investigators or within the confines of a particular community, but for users outside of these small groups it creates significant complications with discovering, accessing, using, and understanding the data (see Metadata Recommendations – Background for an explanation of these four processes).

Spirals

Visual Depiction of the Spiral Model

Spiral Model: Like any language, metadata dialects are living entities that must evolve and expand in response to newly developed requirements within user communities. This constantly escalating effort inherently introduces increasing levels of complexity. To promote progress and metadata improvement, a Spiral model that utilizes small, actionable iterations is employed. Following this model, communities are able to improve their metadata over time.

The metadata improvement process is divided into a series of steps that can be defined and accomplished in a clear and orderly manner.

Each loop in the spiral represents a concept that is divided into 4 quadrants consisting of:

  • Determine Objectives
  • Identify and Resolve Risks
  • Development and Testing
  • Plan Next Iteration

There is no limit to the number of loops that can be added.

Rubrics

Using a rubric is one method to assess the completeness of a spiral.

As discussed above, the metadata improvement process is orchestrated through a series of spirals. In order to measure progress toward improvement goals and characterize the task to be done, the state of the metadata record in the improvement process (spiral) must be quantitatively and consistently evaluated. Once a set of spirals for a particular dialect is defined, this can be accomplished through use of a rubric. The rubric provides a description of the state of a record by arranging the spirals as rows in a table with degree of completeness shown as columns. As the record becomes more complete, the rubric score increases.
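
As a rough illustration of the table described above, the following Python sketch scores a record against a set of spirals. The column labels ("none", "partial", "complete"), the point values, and all spiral and concept names are assumptions made for this example. The source does not specify how the columns are defined or weighted.

```python
# Hypothetical completeness columns, ordered from least to most complete.
LEVELS = ["none", "partial", "complete"]

# Hypothetical spirals and the concepts each one requires.
spirals = {
    "Discovery": ["Title", "Abstract", "Spatial Extent"],
    "Access": ["Distribution URL", "Data Format"],
}


def rubric_row(spiral_concepts, record_concepts):
    """Return the column index (0, 1 or 2) for one spiral."""
    present = sum(1 for c in spiral_concepts if c in record_concepts)
    if present == 0:
        return 0
    if present == len(spiral_concepts):
        return 2
    return 1


def rubric(spirals, record_concepts):
    """Return per-spiral column indexes and the total rubric score."""
    rows = {
        name: rubric_row(concepts, record_concepts)
        for name, concepts in spirals.items()
    }
    return rows, sum(rows.values())


# The same hypothetical record before and after one round of improvement.
before = {"Title"}
after = {"Title", "Abstract", "Spatial Extent", "Distribution URL"}

for label, record in (("before", before), ("after", after)):
    rows, score = rubric(spirals, record)
    print(label)
    for name, level in rows.items():
        print(f"  {name:<10} {LEVELS[level]}")
    print(f"  score: {score}")
```

The score rises as more spirals move toward the "complete" column, which matches the behavior described above. A real rubric might use more columns or weight spirals differently.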

Recommendations

Historically, metadata content has been approached in a variety of ways depending on the needs of specific user communities. This resulted in the development of multiple metadata “dialects” that must evolve and improve as metadata needs change. The metadata improvement process is defined and depicted using spirals, and the success of the spirals is quantitatively and consistently evaluated using rubrics. Recommendations are the conclusion to this evaluation process. They are the metadata elements that are required, recommended, or suggested for a particular community need. Publishing this information in the form of recommendation lists eliminates the need to “reinvent the wheel” and therefore reduces the effort needed to reach a given result. Along those same lines, recommendations are the subset of a given dialect’s capabilities necessary to satisfy the specific needs of a particular user community. Employing only what is necessary to accomplish the task at hand conserves significant time, effort, and energy.