Difference between revisions of "Documentation Terminology"

From Earth Science Information Partners (ESIP)
 
Established metadata terminology is the result of a multi-decade, cooperative effort between metadata experts in NOAA, NASA and other U.S. Federal Agencies.  The selection below is intended to provide a framework of basic terminology in order to facilitate understanding of fundamental metadata concepts.  More in-depth information and application examples are available in [http://figshare.com/articles/Metadata_Evaluation_and_Improvement/1133879 Habermann, 2014].  
+
Established metadata terminology is the result of a multi-decade, cooperative effort between metadata experts in NOAA, NASA and other U.S. Federal Agencies.  The selection below is intended to provide a framework of basic terminology in order to facilitate understanding of fundamental concepts.  More in-depth information and application examples are available in [http://figshare.com/articles/Metadata_Evaluation_and_Improvement/1133879 Habermann, 2014].  
  
 +
=Terms=
 +
<table border="1" cellpadding="3">
 +
    <tr>
 +
        <td valign="top"><b>Collection</b></td>
 +
        <td valign="top">A group of metadata records commonly organized by a data facility, organization or project and often stored in a database or web accessible folder.
 +
        </td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Concept</b></td>
 +
        <td valign="top">General term for describing a documentation entity. Concepts are independent of dialect so a single concept can occur in many dialects. They are typically represented (in XML) by an element or a collection of elements.</td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Dialect</b></td>
 +
        <td valign="top">A particular representation of metadata that is specific to a community. Examples include the Content Standard for Digital Geospatial Metadata (CSDGM), the Directory Interchange Format (DIF), and ISO 19115-3 (the XML representation of the ISO 19115-1 conceptual model).</td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Dialect Maximum</b></td>
 +
        <td valign="top">The number of concepts from a particular recommendation that are included in a particular dialect and, therefore, the maximum number of concepts from that recommendation that can be represented in the dialect. Note:  the dialect maximum is always less than or equal to the number of concepts included in the recommendation (recommendation maximum).</td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Documentation</b></td>
 +
        <td valign="top">The complete collection of unstructured written, drawn, presented or recorded materials necessary for discovering, accessing, understanding, and reproducing scientific data and results.</td>
 +
    </tr>
  
== METADATA CONCEPTS: ==
+
    <tr>
A metadata concept is a way of describing contextual information – independent of dialect.
+
        <td valign="top"><b>Element</b></td>
 
+
        <td valign="top">An item providing a value for a concept, typically in an XML representation. Elements depend on dialects. They are the instantiation of a concept in a dialect.</td>
It is these concepts within the metadata that make the data discoverable, accessible, and usable. Metadata concepts can be described at a general level or include more detailed information.
+
    </tr>
 
+
    <tr>
Ex:  “Spatial Extent” is a high level metadata concept that can be addressed in a general manner; or it can include more detailed concepts like bounding latitude/longitude box or geographic identifiers.  
+
        <td valign="top"><b>Level</b></td>
 
+
        <td valign="top">Recommendations may assign different degrees of necessity to a concept's occurrence in a record, e.g., mandatory, recommended, and suggested. These subsets of concepts within a recommendation are called levels.
 
+
        </td>
== DIALECTS: ==
+
    </tr>
'''Documentation''': A set of unstructured written, drawn, presented or recorded representations of thought(s).  
+
    <tr>
 +
        <td valign="top"><b>Metadata</b></td>
 +
        <td valign="top">Structured and standardized elements of scientific documentation.</td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Recommendation</b></td>
 +
        <td valign="top">A set of concepts that an organization identifies for achieving a documentation goal.
 +
        </td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Recommendation Maximum</b></td>
 +
        <td valign="top">The number of concepts included in a recommendation. Note that the recommendation maximum is the maximum completeness score available for a metadata record being evaluated with respect to that recommendation. The recommendation maximum is always greater than or equal to every dialect maximum for that recommendation.
 +
        </td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Signature</b></td>
 +
        <td valign="top">A series of numbers that gives the number of concepts/elements missing from a metadata record (or a group of metadata records) in each of a series of spirals. Lower numbers indicate fewer missing elements, and a signature made up entirely of zeros indicates a record or group of records that is complete with respect to a particular recommendation/dialect combination. A signature of 2 3 indicates that 2 elements are missing from the first spiral and 3 are missing from the second. The sum of the numbers in a signature is the total number of elements missing from a record or group of records.
 +
        </td>
 +
    </tr>
 +
    <tr>
 +
        <td valign="top"><b>Spiral</b></td>
 +
        <td valign="top">A set of concepts required to support a particular documentation need or use case.
 +
        </td>
 +
    </tr>
 +
</table>
  
 +
=Notes=
 +
== Concepts== 
 +
Concepts can be described at a general level or include more detailed information. For example, “Spatial Extent” is a high-level metadata concept that can be addressed in a general manner, or it can include more detailed concepts such as a bounding latitude/longitude box or geographic identifiers.
 +
==Documentation==
 
*''Examples include but are not limited to: notebooks, scientific papers, web pages, user guides, word processing documents, spreadsheets, data dictionaries, PDFs, custom binary and ASCII formats, and many others, each with associated storage and preservation strategies.''
  
 
More often than not, the scientific process is documented, stored, and circulated using different tools and approaches depending on the needs of an exclusive group within the scientific community.  This customized, often unstructured approach may work well for independent investigators or within the confines of a particular community, but for users outside of these small groups it creates significant complications with discovering, accessing, using, and understanding the data (placeholder: link to Metadata Recommendations – Background for an explanation of these 4 processes).
  
 
+
==Spirals==
'''Metadata''': A structured framework that describes well-defined contextualizing information about the data
 
 
 
This structured, consistent approach facilitates discovery, access, use, and interpretability of datasets by users outside the investigation group.  It also enables integration of metadata into discovery and analysis tools and provides consistent references from the metadata to external documentation.
 
 
 
 
 
'''Metadata Standards''':  Community developed, structured approaches to expressing concepts.
 
 
 
Metadata content can be approached in a variety of “dialects,” depending on the needs of specific user communities.  Thus, exclusive communities have developed their own standard approaches to describing data for discovery, accessibility, usability and understandability purposes.  The results of these efforts are frequently referred to as “metadata standards,” which vary from group to group.
 
 
 
 
 
'''Metadata Dialects''': A collection of similar and connecting standards that represent variations of a universal documentation language.
 
 
 
While there are differences between metadata “standards,” it should be noted that these seemingly unique “standards” also overlap significantly, because the “who, where, when, why, and how” must always be addressed regardless of the community approach.  Thus, in reality, these standards are more akin to dialects of a universal documentation language than to multiple, disparate languages.  As such, for the purposes of this work, the term “metadata dialect” will be substituted for “metadata standard” to promote the understanding that these are different ways of expressing the same ideas.
 
 
 
 
 
== SPIRALS: ==
 
  
 
[[File:Spiral.png|thumb|Visual Depiction of the Spiral Model]]
 
  
 
'''Spiral Model:'''  Like any language, metadata dialects are living entities that must evolve and expand in response to newly developed requirements within user communities.  This constantly escalating effort inherently introduces increasing levels of complexity.  To promote progress and metadata improvement, a Spiral model that utilizes small, actionable iterations is employed. Following this model, communities are able to improve their metadata over time.   
 
 
  
 
The metadata improvement process is divided into a series of steps that can be defined and accomplished in a clear and orderly manner.   
 
 
  
 
Each loop in the spiral represents a concept that is divided into 4 quadrants consisting of:   
 
:*Determine Objectives
:*Identify and Resolve Risks
:*Development and Testing
 
:*Plan Next Iteration
 
  
 +
There is no limit to the number of loops that can be added.
 +
 +
==Rubrics==
 +
Use of a Rubric is one method to assess the completeness of a spiral.
  
+
As discussed above, the metadata improvement process is orchestrated through a series of spirals.  In order to measure progress toward improvement goals and characterize the task to be done, the state of the metadata record in the improvement process (spiral) must be quantitatively and consistently evaluated.  Once a set of spirals for a particular dialect is defined, this can be accomplished through use of a rubric.  The rubric describes the state of a record by arranging the spirals as rows in a table, with degree of completeness shown as columns. As the record becomes more complete, the rubric score increases.
[[Category:Documentation Connections]]
+
 
 +
== Recommendations ==
 +
Historically, metadata content has been approached in a variety of ways depending on the needs of specific user communities.  This resulted in the development of multiple metadata “dialects” that must evolve and improve as metadata needs change.  The metadata improvement process is defined and depicted using spirals, and the success of the spirals is quantitatively and consistently evaluated using rubrics.
 +
Recommendations are the conclusion of this evaluation process.  They are the metadata elements that are required, recommended, or suggested for a particular community need. Publishing this information in the form of recommendation lists eliminates the need to “reinvent the wheel” and so reduces the effort needed to produce useful metadata.  Recommendations are also a subset of a given dialect’s capabilities: only those necessary to satisfy the specific needs of a particular user community.  Employing only what is necessary to accomplish the task at hand conserves time, effort, and energy.
 +
 
 +
[[Category:Documentation_Connections]]

Latest revision as of 15:07, September 16, 2017

Established metadata terminology is the result of a multi-decade, cooperative effort between metadata experts in NOAA, NASA and other U.S. Federal Agencies. The selection below is intended to provide a framework of basic terminology in order to facilitate understanding of fundamental concepts. More in-depth information and application examples are available in Habermann, 2014.

Terms

Collection: A group of metadata records commonly organized by a data facility, organization, or project and often stored in a database or web-accessible folder.
Concept: General term for describing a documentation entity. Concepts are independent of dialect, so a single concept can occur in many dialects. They are typically represented (in XML) by an element or a collection of elements.
Dialect: A particular representation of metadata that is specific to a community. Examples include the Content Standard for Digital Geospatial Metadata (CSDGM), the Directory Interchange Format (DIF), and ISO 19115-3 (the XML representation of the ISO 19115-1 conceptual model).
Dialect Maximum: The number of concepts from a particular recommendation that are included in a particular dialect and, therefore, the maximum number of concepts from that recommendation that can be represented in the dialect. Note: the dialect maximum is always less than or equal to the number of concepts included in the recommendation (recommendation maximum).
Documentation: The complete collection of unstructured written, drawn, presented, or recorded materials necessary for discovering, accessing, understanding, and reproducing scientific data and results.
Element: An item providing a value for a concept, typically in an XML representation. Elements depend on dialects. They are the instantiation of a concept in a dialect.
Level: Recommendations may assign different degrees of necessity to a concept's occurrence in a record, e.g., mandatory, recommended, and suggested. These subsets of concepts within a recommendation are called levels.
Metadata: Structured and standardized elements of scientific documentation.
Recommendation: A set of concepts that an organization identifies for achieving a documentation goal.
Recommendation Maximum: The number of concepts included in a recommendation. Note that the recommendation maximum is the maximum completeness score available for a metadata record being evaluated with respect to that recommendation. The recommendation maximum is always greater than or equal to every dialect maximum for that recommendation.
Signature: A series of numbers that gives the number of concepts/elements missing from a metadata record (or a group of metadata records) in each of a series of spirals. Lower numbers indicate fewer missing elements, and a signature made up entirely of zeros indicates a record or group of records that is complete with respect to a particular recommendation/dialect combination. A signature of 2 3 indicates that 2 elements are missing from the first spiral and 3 are missing from the second. The sum of the numbers in a signature is the total number of elements missing from a record or group of records.
Spiral: A set of concepts required to support a particular documentation need or use case.
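
The Signature definition above can be illustrated with a short sketch. It assumes each spiral is modeled as a set of concept names and a record as the set of concepts it contains; the function name `signature` and the spiral contents are hypothetical, chosen only so that the result matches the "2 3" example in the definition.

```python
from typing import List, Set


def signature(spirals: List[Set[str]], present: Set[str]) -> List[int]:
    """Return the number of concepts missing from each spiral, in order."""
    return [len(spiral - present) for spiral in spirals]


# Hypothetical spirals and record, for illustration only.
spirals = [
    {"title", "abstract", "spatial extent", "temporal extent"},
    {"contact", "license", "lineage", "keywords"},
]
record = {"title", "abstract", "contact"}

sig = signature(spirals, record)
print(sig)       # [2, 3]: 2 missing from the first spiral, 3 from the second
print(sum(sig))  # 5: total number of missing concepts
```

A record containing every concept in every spiral would give a signature made up entirely of zeros.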

Notes

Concepts

Concepts can be described at a general level or include more detailed information. For example, “Spatial Extent” is a high-level metadata concept that can be addressed in a general manner, or it can include more detailed concepts such as a bounding latitude/longitude box or geographic identifiers.
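
One way to picture the relationship between concepts, dialects, and elements is a lookup table keyed by concept. The sketch below is only an illustration: the dialect names (`dialect_a`, `dialect_b`) and the element paths are invented and do not correspond to any real metadata standard.

```python
from typing import Dict, Optional

# Hypothetical dialects and element paths, invented for illustration.
# A concept is dialect-independent; an element is its form in one dialect.
ELEMENT_FOR_CONCEPT: Dict[str, Dict[str, str]] = {
    "Spatial Extent": {
        "dialect_a": "/record/coverage/boundingBox",
        "dialect_b": "/meta/geo/extent",
    },
    "Title": {
        "dialect_a": "/record/title",
    },
}


def element_for(concept: str, dialect: str) -> Optional[str]:
    """Return the element that instantiates a concept in a dialect, if any."""
    return ELEMENT_FOR_CONCEPT.get(concept, {}).get(dialect)


print(element_for("Spatial Extent", "dialect_a"))
print(element_for("Spatial Extent", "dialect_b"))
print(element_for("Title", "dialect_b"))  # None: dialect_b has no such element
```

The same concept appears under both dialects, while each element path belongs to exactly one dialect.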

Documentation

  • Examples include but are not limited to: notebooks, scientific papers, web pages, user guides, word processing documents, spreadsheets, data dictionaries, PDFs, custom binary and ASCII formats, and many others, each with associated storage and preservation strategies.

More often than not, the scientific process is documented, stored, and circulated using different tools and approaches depending on the needs of an exclusive group within the scientific community. This customized, often unstructured approach may work well for independent investigators or within the confines of a particular community, but for users outside of these small groups it creates significant complications with discovering, accessing, using, and understanding the data (placeholder: link to Metadata Recommendations – Background for an explanation of these 4 processes).

Spirals

[Figure: Visual Depiction of the Spiral Model]

Spiral Model: Like any language, metadata dialects are living entities that must evolve and expand in response to newly developed requirements within user communities. This constantly escalating effort inherently introduces increasing levels of complexity. To promote progress and metadata improvement, a Spiral model that utilizes small, actionable iterations is employed. Following this model, communities are able to improve their metadata over time.

The metadata improvement process is divided into a series of steps that can be defined and accomplished in a clear and orderly manner.

Each loop in the spiral represents a concept that is divided into 4 quadrants consisting of:

  • Determine Objectives
  • Identify and Resolve Risks
  • Development and Testing
  • Plan Next Iteration

There is no limit to the number of loops that can be added.

Rubrics

Use of a Rubric is one method to assess the completeness of a spiral.

As discussed above, the metadata improvement process is orchestrated through a series of spirals. In order to measure progress toward improvement goals and characterize the task to be done, the state of the metadata record in the improvement process (spiral) must be quantitatively and consistently evaluated. Once a set of spirals for a particular dialect is defined, this can be accomplished through use of a rubric. The rubric describes the state of a record by arranging the spirals as rows in a table, with degree of completeness shown as columns. As the record becomes more complete, the rubric score increases.
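
The rubric layout described above (spirals as rows, degree of completeness as columns) might be sketched as follows. This is a minimal illustration under stated assumptions: the three column labels, the spiral names, and the scoring rule (sum of column indices) are all hypothetical and are not taken from the source.

```python
from typing import Dict, Set

# Assumed completeness columns, ordered from least to most complete.
COLUMNS = ["none", "partial", "complete"]


def rubric(spirals: Dict[str, Set[str]], present: Set[str]) -> Dict[str, str]:
    """Place each spiral (row) in the column that matches its completeness."""
    rows = {}
    for name, concepts in spirals.items():
        found = len(concepts & present)
        if found == 0:
            rows[name] = "none"
        elif found < len(concepts):
            rows[name] = "partial"
        else:
            rows[name] = "complete"
    return rows


def rubric_score(rows: Dict[str, str]) -> int:
    """Assumed scoring: sum of column indices, so fuller rows score higher."""
    return sum(COLUMNS.index(column) for column in rows.values())


# Hypothetical spirals and records, for illustration only.
spirals = {
    "discovery": {"title", "abstract"},
    "understanding": {"lineage", "quality", "units"},
}
draft = {"title", "abstract", "lineage"}
improved = draft | {"quality", "units"}

print(rubric(spirals, draft))
print(rubric_score(rubric(spirals, draft)))     # 3
print(rubric_score(rubric(spirals, improved)))  # 4
```

Under this assumed scoring, adding the missing concepts to the draft record moves the second spiral from "partial" to "complete" and raises the score accordingly.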

Recommendations

Historically, metadata content has been approached in a variety of ways depending on the needs of specific user communities. This resulted in the development of multiple metadata “dialects” that must evolve and improve as metadata needs change. The metadata improvement process is defined and depicted using spirals, and the success of the spirals is quantitatively and consistently evaluated using rubrics. Recommendations are the conclusion of this evaluation process. They are the metadata elements that are required, recommended, or suggested for a particular community need. Publishing this information in the form of recommendation lists eliminates the need to “reinvent the wheel” and so reduces the effort needed to produce useful metadata. Recommendations are also a subset of a given dialect’s capabilities: only those necessary to satisfy the specific needs of a particular user community. Employing only what is necessary to accomplish the task at hand conserves time, effort, and energy.
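
The relationship between levels, the recommendation maximum, and the dialect maximum can be shown with a small worked example. It assumes a recommendation is modeled as a mapping from level name to a set of concepts, and a dialect as the set of concepts it is able to represent; the concept names and function names are hypothetical.

```python
from typing import Dict, Set


def recommendation_maximum(recommendation: Dict[str, Set[str]]) -> int:
    """Number of distinct concepts in the recommendation, across all levels."""
    return len(set().union(*recommendation.values()))


def dialect_maximum(recommendation: Dict[str, Set[str]],
                    dialect_concepts: Set[str]) -> int:
    """Number of recommendation concepts that the dialect can represent."""
    return len(set().union(*recommendation.values()) & dialect_concepts)


# Hypothetical recommendation with three levels, for illustration only.
recommendation = {
    "mandatory": {"title", "abstract"},
    "recommended": {"spatial extent", "temporal extent"},
    "suggested": {"lineage"},
}
# Concepts a hypothetical dialect can express; "keywords" is not recommended.
dialect_concepts = {"title", "abstract", "spatial extent", "keywords"}

print(recommendation_maximum(recommendation))             # 5
print(dialect_maximum(recommendation, dialect_concepts))  # 3
```

Because the dialect maximum counts only concepts that appear in the recommendation, it can never exceed the recommendation maximum.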