Difference between revisions of "Documentation Terminology"

From Earth Science Information Partners (ESIP)

Revision as of 08:32, June 25, 2015

Established metadata terminology is the result of a multi-decade, cooperative effort between metadata experts in NOAA, NASA and other U.S. Federal Agencies. The selection below is intended to provide a framework of basic terminology in order to facilitate understanding of fundamental metadata concepts. More in-depth information and application examples are available in Habermann, 2014.


METADATA CONCEPTS:

A metadata concept is a way of describing contextual information – independent of dialect.

It is these concepts within the metadata that make the data discoverable, accessible, and usable. Metadata concepts can be described at a general level or include more detailed information.

Ex: “Spatial Extent” is a high-level metadata concept that can be addressed in a general manner, or it can include more detailed concepts such as a bounding latitude/longitude box or geographic identifiers.
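As a rough illustration, the general and detailed forms of the “Spatial Extent” concept could be modeled as shown here. This is a minimal Python sketch under stated assumptions: the class and field names (`SpatialExtent`, `BoundingBox`, `geographic_identifiers`) and the place name and coordinates are invented for the example and are not taken from any particular dialect.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoundingBox:
    """Detailed concept: a bounding latitude/longitude box in decimal degrees."""
    west: float
    east: float
    south: float
    north: float


@dataclass
class SpatialExtent:
    """High-level concept; the more detailed concepts are optional."""
    description: str  # general, free-text statement of the extent
    bounding_box: Optional[BoundingBox] = None
    geographic_identifiers: List[str] = field(default_factory=list)

    def detail_level(self) -> str:
        """Report whether the concept is addressed generally or in detail."""
        if self.bounding_box is not None or self.geographic_identifiers:
            return "detailed"
        return "general"


# Illustrative values only; "Example Bay" is a made-up place.
general = SpatialExtent(description="Example Bay")
detailed = SpatialExtent(
    description="Example Bay",
    bounding_box=BoundingBox(west=-10.0, east=-5.0, south=40.0, north=45.0),
    geographic_identifiers=["Example Bay"],
)
print(general.detail_level())
print(detailed.detail_level())
```

Both objects describe the same concept; only the amount of detail differs, which is the distinction the example above draws.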


DIALECTS:

Documentation: A set of unstructured written, drawn, presented or recorded representations of thought(s).

  • Examples include but are not limited to: notebooks, scientific papers, web pages, user guides, word processing documents, spreadsheets, data dictionaries, PDFs, custom binary and ASCII formats, and many others, each with associated storage and preservation strategies.

More often than not, the scientific process is documented, stored, and circulated using different tools and approaches depending on the needs of an exclusive group within the scientific community. This customized, often unstructured approach may work well for independent investigators or within the confines of a particular community, but for users outside of these small groups it creates significant complications with discovering, accessing, using, and understanding the data (Space keeper - link to Metadata Recommendations – Background for an explanation of these 4 processes).


Metadata: A structured framework that describes well-defined contextualizing information about the data.

This structured, consistent approach facilitates discovery, access, use, and interpretability of datasets by users outside the investigation group. It also enables integration of metadata into discovery and analysis tools and provides consistent references from the metadata to external documentation.


Metadata Standards: Community-developed, structured approaches to expressing concepts.

Metadata content can be approached in a variety of “dialects,” depending on the needs of specific user communities. Thus, exclusive communities have developed their own standard approaches to describing data for discovery, accessibility, usability and understandability purposes. The results of these efforts are frequently referred to as “metadata standards,” which vary from group to group.


Metadata Dialects: A collection of similar and connecting standards that represent variations of a universal documentation language.

While there are differences between metadata “standards,” these seemingly unique “standards” also overlap significantly, because the “who, where, when, why, and how” must always be addressed regardless of the community approach. Thus, in reality, these standards are more akin to dialects of a universal documentation language than to multiple, disparate languages. As such, for the purposes of this work, the term “metadata dialect” will be substituted for “metadata standard” to promote the understanding that these are different ways of expressing the same ideas.
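The idea that dialects express the same concepts in different ways can be sketched as a lookup table keyed by concept. The following Python sketch is purely illustrative: the dialect names (`dialect_a`, `dialect_b`), every path in the table, and the `translate` helper are hypothetical and do not correspond to any real metadata standard.

```python
from typing import Dict

# Hypothetical concept-to-path table. The dialect names and every path
# below are invented for illustration only.
CONCEPT_PATHS: Dict[str, Dict[str, str]] = {
    "who":   {"dialect_a": "/record/contact/name",  "dialect_b": "/meta/creator"},
    "where": {"dialect_a": "/record/extent/bbox",   "dialect_b": "/meta/coverage/spatial"},
    "when":  {"dialect_a": "/record/extent/period", "dialect_b": "/meta/coverage/temporal"},
    "why":   {"dialect_a": "/record/purpose",       "dialect_b": "/meta/description/purpose"},
    "how":   {"dialect_a": "/record/lineage",       "dialect_b": "/meta/provenance"},
}


def translate(record: Dict[str, str], source: str, target: str) -> Dict[str, str]:
    """Re-express a flat {path: value} record from one dialect in another.

    Paths that do not correspond to a known concept are dropped.
    """
    path_map = {paths[source]: paths[target] for paths in CONCEPT_PATHS.values()}
    return {path_map[p]: v for p, v in record.items() if p in path_map}


record_a = {
    "/record/contact/name": "Example Investigator",
    "/record/purpose": "Illustrative record only",
}
record_b = translate(record_a, "dialect_a", "dialect_b")
print(record_b)
```

Because the table is organized by concept rather than by dialect, adding another dialect would only mean adding one more entry per concept; the concepts themselves stay unchanged.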

SPIRALS:

Visual Depiction of the Spiral Model

Spiral Model: Like any language, metadata dialects are living entities that must evolve and expand in response to newly developed requirements within user communities. This constantly escalating effort inherently introduces increasing levels of complexity. To promote progress and metadata improvement, a Spiral model that utilizes small, actionable iterations is employed. Following this model, communities are able to improve their metadata over time.

The metadata improvement process is divided into a series of steps that can be defined and accomplished in a clear and orderly manner.

Each loop in the spiral represents a concept that is divided into 4 quadrants consisting of:

  • Determine Objectives
  • Identify and Resolve Risks
  • Development and Testing
  • Plan Next Iteration

There is no limit to the number of loops that can be added.


Spirals: Spirals are collections of concepts needed to address specific use cases or requirements. The following is an example of one way they can be enlarged:

Place holder for Table


RUBRICS:

Use of a Rubric is one method to assess the completeness of a spiral.

As discussed above, the metadata improvement process is orchestrated through a series of spirals. In order to measure progress toward improvement goals and characterize the task to be done, the state of the metadata record in the improvement process (spiral) must be quantitatively and consistently evaluated. Once a set of spirals for a particular dialect is defined, this can be accomplished through use of a rubric. The rubric provides a description of the state of a record by arranging the spirals as rows in a table with degree of completeness shown as columns. As the record becomes more complete, the rubric score increases.
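One possible way to compute such a rubric is sketched here in Python. The spiral names, the concepts in each spiral, the three completeness labels ("none", "some", "all"), and the scoring rule (one point per concept present) are all assumptions made for this example; an actual rubric would use the spirals defined for the dialect in question.

```python
from typing import Dict, List

# Hypothetical spirals: each one is a named collection of concepts.
SPIRALS: Dict[str, List[str]] = {
    "Identification": ["title", "abstract", "identifier"],
    "Extent": ["spatial_extent", "temporal_extent"],
    "Distribution": ["access_url", "format"],
}


def rubric(record: Dict[str, str]) -> Dict[str, Dict[str, object]]:
    """Build one rubric row per spiral with its degree of completeness."""
    rows = {}
    for spiral, concepts in SPIRALS.items():
        # A concept counts as present only if it has a non-empty value.
        present = sum(1 for concept in concepts if record.get(concept))
        if present == 0:
            level = "none"
        elif present < len(concepts):
            level = "some"
        else:
            level = "all"
        rows[spiral] = {"present": present, "total": len(concepts), "level": level}
    return rows


def rubric_score(record: Dict[str, str]) -> int:
    """Assumed scoring rule: one point for every concept that is present."""
    return sum(row["present"] for row in rubric(record).values())


record = {"title": "Example dataset", "abstract": "", "spatial_extent": "Example Bay"}
for spiral, row in rubric(record).items():
    print(f"{spiral:15} {row['present']}/{row['total']}  {row['level']}")
print("score:", rubric_score(record))
```

Filling in a missing concept, such as the empty abstract in this record, raises the score by one, which matches the behavior described above: the score increases as the record becomes more complete.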


RECOMMENDATIONS:

Historically, metadata content has been approached in a variety of ways depending on the needs of specific user communities. This resulted in the development of multiple metadata “dialects” that must evolve and improve as metadata needs change. The metadata improvement process is defined and depicted using spirals, and the success of the spirals is quantitatively and consistently evaluated using rubrics. Recommendations are the conclusion to this evaluation process. They are the metadata elements that are required, recommended, or suggested for a particular community need. Publishing this information in the form of recommendation lists eliminates the need to “reinvent the wheel” and therefore facilitates maximum output for minimum effort. Along those same lines, it should also be noted that recommendations are a subset of a given dialect’s capabilities necessary to satisfy the specific needs of a particular user community. By employing only what is necessary to accomplish the task at hand, time, effort, and energy are significantly conserved.
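A recommendation list of this kind can be represented as metadata elements grouped by level and then compared against a record. The Python sketch below is illustrative only: the element names, their assignment to the "required", "recommended", and "suggested" levels, and the `missing_elements` helper are hypothetical and do not reproduce any published recommendation.

```python
from typing import Dict, List

# Hypothetical recommendation for one community need.
RECOMMENDATION: Dict[str, List[str]] = {
    "required": ["title", "abstract"],
    "recommended": ["spatial_extent", "temporal_extent"],
    "suggested": ["keywords"],
}


def missing_elements(
    record: Dict[str, str], recommendation: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """List, per level, the elements that are absent or empty in the record."""
    return {
        level: [element for element in elements if not record.get(element)]
        for level, elements in recommendation.items()
    }


record = {"title": "Example dataset", "spatial_extent": "Example Bay"}
gaps = missing_elements(record, RECOMMENDATION)
for level, elements in gaps.items():
    print(f"{level}: {elements}")
```

Only the elements named in the recommendation are checked, which reflects the point that a recommendation is a subset of what the dialect can express.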


Section 1 - Let's Start at the Beginning