Individual, Organization, and Role Documentation

From Earth Science Information Partners (ESIP)

Revision as of 14:11, September 18, 2015

Overview

Documenting people is critical to many use cases involving data discovery, use, and understanding. People may have many different roles within the data creation and management life cycle. This page provides the location (XPath) in the metadata for documenting people in a variety of roles for the DIF, ECHO, FGDC CSDGM, and ISO 19115-2 dialects. XML examples are also provided for each of the dialects.

Implementation

DIF

Structure

Usage

ECHO

Structure

Usage

FGDC

Structure

Usage

ISO 19115

Structure

Usage

How are Authors Documented?

CSDGM

<idinfo>
    <citation>
        <citeinfo>
            <origin>Matthew Granitto</origin>
            <origin>Elizabeth A. Bailey</origin>
            <origin>Jeanine M. Schmidt</origin>
            <origin>Nora B. Shew</origin>
            <origin>Bruce M. Gamble</origin>
            <origin>Keith A. Labay</origin>
        </citeinfo>
    </citation>
</idinfo>

DIF

<Data_Set_Citation>
      <Dataset_Creator>TES Science Team (Scott Gluck, NASA/ASDC)</Dataset_Creator>
</Data_Set_Citation>

ECHO


ISO

<gmd:CI_ResponsibleParty>
     <gmd:individualName>
          <gco:CharacterString>Yosemite Sam</gco:CharacterString>
     </gmd:individualName>
     <gmd:organisationName>
          <gco:CharacterString>ACME Corporation</gco:CharacterString>
     </gmd:organisationName>
     <gmd:role>
          <gmd:CI_RoleCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode" codeListValue="author">author</gmd:CI_RoleCode>
     </gmd:role>
</gmd:CI_ResponsibleParty>

How are Originators Documented?

CSDGM

<metadata>

   <idinfo>
       <citation>
           <citeinfo>
               <origin>Mark J. Johnsson</origin>
               <origin>David G. Howell</origin>
               <pubdate>1996</pubdate>
               <title>Generalized Thermal Maturity Map of Alaska</title>
               <geoform>map</geoform>
               <serinfo>
                   <sername>U.S. Geological Survey Miscellaneous Investigations Series Map</sername>
                   <issue>I-2494</issue>
               </serinfo>
               <pubinfo>
                   <pubplace>Menlo Park, CA</pubplace>
                   <publish>U.S. Geological Survey</publish>
               </pubinfo>
               <onlink>http://pubs.usgs.gov/dds/dds-54/Map/</onlink>
           </citeinfo>
       </citation>
   </idinfo>

</metadata>

DIF

<Data_Set_Citation>

   <Dataset_Creator>TES Science Team (Scott Gluck, NASA/ASDC)</Dataset_Creator>
   <Dataset_Editor>S. Gluck</Dataset_Editor>
   <Dataset_Title>TES Aura L3 Deuterium Oxide (HDO) Monthly Gridded V003</Dataset_Title>
   <Dataset_Series_Name>TL3HDOM</Dataset_Series_Name>
   <Dataset_Release_Date>2013</Dataset_Release_Date>
   <Dataset_Release_Place>Hampton, VA, USA</Dataset_Release_Place>
   <Dataset_Publisher>NASA Langley Research Center (LaRC) Atmospheric Science Data Center (ASDC)</Dataset_Publisher>
   <Version>003</Version>
   <Data_Presentation_Form>Digital Science Data</Data_Presentation_Form>
   <Dataset_DOI>10.5067/AURA/TES/TESTL3HDOM_L3</Dataset_DOI>
   <Online_Resource>https://eosweb.larc.NASA.gov/project/tes/tes_tl3hdom_table</Online_Resource>

</Data_Set_Citation>

ECHO

<Contact>

   <Role>Investigator</Role>
   <HoursOfService>9-5 Pacific weekdays</HoursOfService>
   <Instructions>Contact by email first</Instructions>
   <OrganizationAddresses>
       <Address>
           <StreetAddress>4800 Oak Grove Drive</StreetAddress>
           <City>Pasadena</City>
           <StateProvince>CA</StateProvince>
           <PostalCode>91109</PostalCode>
           <Country>USA</Country>
       </Address>
   </OrganizationAddresses>
   <OrganizationPhones>
       <Phone>
           <Number>818-354-6319</Number>
           <Type>phone</Type>
       </Phone>
   </OrganizationPhones>
   <OrganizationEmails>
       <Email>Dave.Diner@jpl.nasa.gov</Email>
   </OrganizationEmails>
   <ContactPersons>
       <ContactPerson>
           <FirstName>Dave</FirstName>
           <MiddleName>J.</MiddleName>
           <LastName>Diner</LastName>
           <JobPosition>Technical Contact for Science</JobPosition>
       </ContactPerson>
   </ContactPersons>

</Contact>

ISO

<gmd:contact>

   <gmd:CI_ResponsibleParty>
     <gmd:individualName>
       <gco:CharacterString>
         Christine R. Martin
       </gco:CharacterString>
     </gmd:individualName>
     <gmd:organisationName>
       <gco:CharacterString>
         University of Alaska - Fairbanks
       </gco:CharacterString>
     </gmd:organisationName>
     <gmd:positionName>
       <gco:CharacterString>
         Alaska Geobotany Center
       </gco:CharacterString>
     </gmd:positionName>
     <gmd:contactInfo>
       <gmd:CI_Contact>
         <gmd:address>
           <gmd:CI_Address>
             <gmd:city>
               <gco:CharacterString>
                 Fairbanks
               </gco:CharacterString>
             </gmd:city>
             <gmd:administrativeArea>
               <gco:CharacterString>
                 Alaska
               </gco:CharacterString>
             </gmd:administrativeArea>
             <gmd:country>
               <gco:CharacterString>
                 USA
               </gco:CharacterString>
             </gmd:country>
             <gmd:electronicMailAddress>
               <gco:CharacterString>
                 fncrm@uaf.edu
               </gco:CharacterString>
             </gmd:electronicMailAddress>
           </gmd:CI_Address>
         </gmd:address>
       </gmd:CI_Contact>
     </gmd:contactInfo>
     <gmd:role>
       <gmd:CI_RoleCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode" codeListValue="originator">
         originator
       </gmd:CI_RoleCode>
     </gmd:role>
   </gmd:CI_ResponsibleParty>
 </gmd:contact>

How are Publishers Documented?

CSDGM

<metadata>

   <idinfo>
       <citation>
           <citeinfo>
               <origin>Mark J. Johnsson</origin>
               <origin>David G. Howell</origin>
               <pubdate>1996</pubdate>
               <title>Generalized Thermal Maturity Map of Alaska</title>
               <geoform>map</geoform>
               <serinfo>
                   <sername>U.S. Geological Survey Miscellaneous Investigations Series Map</sername>
                   <issue>I-2494</issue>
               </serinfo>
               <pubinfo>
                   <pubplace>Menlo Park, CA</pubplace>
                   <publish>U.S. Geological Survey</publish>
               </pubinfo>
               <onlink>http://pubs.usgs.gov/dds/dds-54/Map/</onlink>
           </citeinfo>
       </citation>
   </idinfo>

</metadata>

DIF

<Data_Set_Citation>

   <Dataset_Creator>TES Science Team (Scott Gluck, NASA/ASDC)</Dataset_Creator>
   <Dataset_Editor>S. Gluck</Dataset_Editor>
   <Dataset_Title>TES Aura L3 Deuterium Oxide (HDO) Monthly Gridded V003</Dataset_Title>
   <Dataset_Series_Name>TL3HDOM</Dataset_Series_Name>
   <Dataset_Release_Date>2013</Dataset_Release_Date>
   <Dataset_Release_Place>Hampton, VA, USA</Dataset_Release_Place>
   <Dataset_Publisher>NASA Langley Research Center (LaRC) Atmospheric Science Data Center (ASDC)</Dataset_Publisher>
   <Version>003</Version>
   <Data_Presentation_Form>Digital Science Data</Data_Presentation_Form>
   <Dataset_DOI>10.5067/AURA/TES/TESTL3HDOM_L3</Dataset_DOI>
   <Online_Resource>https://eosweb.larc.NASA.gov/project/tes/tes_tl3hdom_table</Online_Resource>

</Data_Set_Citation>

ISO

<gmd:citedResponsibleParty>

        <gmd:CI_ResponsibleParty>
             <gmd:organisationName>
               <gco:CharacterString>
                 UCAR/NCAR - Earth Observing Laboratory
               </gco:CharacterString>
             </gmd:organisationName>
             <gmd:positionName>
               <gco:CharacterString>
                 EOL Data Support
               </gco:CharacterString>
             </gmd:positionName>
             <gmd:contactInfo>
               <gmd:CI_Contact>
                 <gmd:address>
                   <gmd:CI_Address>
                     <gmd:deliveryPoint>
                       <gco:CharacterString>
                         PO Box 3000
                       </gco:CharacterString>
                     </gmd:deliveryPoint>
                     <gmd:city>
                       <gco:CharacterString>
                         Boulder
                       </gco:CharacterString>
                     </gmd:city>
                     <gmd:administrativeArea>
                       <gco:CharacterString>
                         CO
                       </gco:CharacterString>
                     </gmd:administrativeArea>
                     <gmd:postalCode>
                       <gco:CharacterString>
                         80307-3000
                       </gco:CharacterString>
                     </gmd:postalCode>
                     <gmd:country>
                       <gco:CharacterString>
                         USA
                       </gco:CharacterString>
                     </gmd:country>
                   </gmd:CI_Address>
                 </gmd:address>
                 <gmd:onlineResource>
                   <gmd:CI_OnlineResource>
                     <gmd:linkage>
                       <gmd:URL>
                         http://data.eol.ucar.edu/
                       </gmd:URL>
                     </gmd:linkage>
                     <gmd:name>
                       <gco:CharacterString>
                         homepage
                       </gco:CharacterString>
                     </gmd:name>
                   </gmd:CI_OnlineResource>
                 </gmd:onlineResource>
               </gmd:CI_Contact>
             </gmd:contactInfo>
             <gmd:role>
               <gmd:CI_RoleCode codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode" codeListValue="publisher">
               publisher
               </gmd:CI_RoleCode>
             </gmd:role>
           </gmd:CI_ResponsibleParty>

</gmd:citedResponsibleParty>

How are Contributors Documented?

Notes

CodeLists

Codelists are shared vocabularies used throughout the ISO Standards to provide a (usually small) set of choices for the value of an element. In many cases they provide a standard set of tags that can be used for classifying an object. They can be identified in the UML because their type names end with "Code", e.g. CI_RoleCode.

All codelists share the codeList and codeListValue attributes, which give the location of the codelist and the value being used from it in a particular case. Multiple codelists can be stored in a single codeListCatalog (see http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/codelist/gmxCodelists.xml for an example), so the location usually consists of a URL plus an anchor for the specific codelist. The value is given both in the codeListValue attribute and as the text content of the element: <ns:codeListName codeList="URL#codeListName" codeListValue="value">value</ns:codeListName>
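The URL-plus-anchor convention can be illustrated with a short Python sketch. It uses only the standard library; the function name `describe_codelist_element` and the returned dictionary keys are made up for this illustration, and the sample element is the CI_RoleCode from the author example on this page.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"

# Sample element taken from the ISO author example on this page.
ROLE_XML = (
    f'<gmd:CI_RoleCode xmlns:gmd="{GMD}" '
    'codeList="http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode" '
    'codeListValue="author">author</gmd:CI_RoleCode>'
)


def describe_codelist_element(xml_text):
    """Split a codelist element into catalog URL, codelist name, and value.

    Hypothetical helper for illustration; it assumes the codeList attribute
    is present on the element.
    """
    element = ET.fromstring(xml_text)
    # The part before '#' locates the catalog; the anchor names the codelist.
    catalog_url, _, codelist_name = element.get("codeList").partition("#")
    attribute_value = element.get("codeListValue")
    text_value = (element.text or "").strip()
    return {
        "catalog": catalog_url,
        "codelist": codelist_name,
        "value": attribute_value,
        # The attribute and the element text normally carry the same value.
        "consistent": attribute_value == text_value,
    }


info = describe_codelist_element(ROLE_XML)
print(info["codelist"], info["value"], info["consistent"])
```

A check like the `consistent` flag can help catch records where the attribute and the element text have drifted apart.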

See [CodeLists] for a list of all ISO CodeLists. 

Roles-by-Position vs. Roles-by-Code

People can play many different roles in the life cycle of scientific datasets. There are two ways that those roles can be reflected in a metadata structure: by position and by code. Many people are familiar with the roles-by-position approach because that is the approach used in the FGDC CSDGM. The person referenced from the metadata section is the metadata contact, the person referenced from the distribution section is the distributor, and so on. Using this approach means that the object that holds information about people does not need any role indicator. That information is supplied by the position of the person in the structure.
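As a rough illustration of roles-by-position, the Python sketch below reads roles from a CSDGM record purely from element paths. The record is a trimmed copy of the CSDGM example on this page; the `ROLE_BY_POSITION` mapping and the function name are assumptions made for this sketch, not part of the standard.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the CSDGM example shown earlier on this page.
CSDGM_XML = """
<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Mark J. Johnsson</origin>
        <origin>David G. Howell</origin>
        <pubinfo>
          <pubplace>Menlo Park, CA</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>
"""

# Hypothetical mapping for illustration: the role is implied by where the
# element sits in the record, not by any role attribute on the element.
ROLE_BY_POSITION = {
    "originator": "idinfo/citation/citeinfo/origin",
    "publisher": "idinfo/citation/citeinfo/pubinfo/publish",
}


def roles_by_position(xml_text):
    """Return a dict of role name to the names found at that position."""
    root = ET.fromstring(xml_text)
    return {
        role: [element.text for element in root.findall(path)]
        for role, path in ROLE_BY_POSITION.items()
    }


roles = roles_by_position(CSDGM_XML)
print(roles)
```

Note that the `origin` and `publish` elements hold plain names only; nothing inside them says what role the person or organization plays.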

The ISO Standards combine the roles-by-position approach with the roles-by-code approach. Roles can generally be inferred from the positions of CI_ResponsibleParty objects in the structure, but adding a role code to each object increases flexibility. This is helpful when citing a dataset that involves people in multiple roles (principal investigator, publisher, author, resourceProvider) or when specifying the point of contact.

The roles-by-code approach allows the roles of the people involved with a dataset to be known even when they are accessed separately from their position in the structure. For example, a specific XPath (/gmi:MI_Metadata/gmd:contact) can be used if one is interested in the metadata contact for a resource, while a general XPath (//gmd:CI_ResponsibleParty) can be used to answer the general question "what people or organizations are associated with this dataset?" In the latter case, the role code provides information about roles even though the people are being accessed independently.
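The two kinds of query can be sketched in Python with the standard library. The record below is a made-up, heavily trimmed ISO fragment (not schema-valid) that reuses the placeholder names from the author example; `party_name` and `party_role` are hypothetical helpers. ElementTree paths are relative to the root element, so /gmi:MI_Metadata/gmd:contact becomes `gmd:contact` and //gmd:CI_ResponsibleParty becomes `.//gmd:CI_ResponsibleParty`.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmi": "http://www.isotc211.org/2005/gmi",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
CODELIST = "http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode"

# Illustrative record only: trimmed to the parts the queries touch.
RECORD_XML = f"""
<gmi:MI_Metadata xmlns:gmi="{NS['gmi']}" xmlns:gmd="{NS['gmd']}" xmlns:gco="{NS['gco']}">
  <gmd:contact>
    <gmd:CI_ResponsibleParty>
      <gmd:organisationName><gco:CharacterString>ACME Corporation</gco:CharacterString></gmd:organisationName>
      <gmd:role><gmd:CI_RoleCode codeList="{CODELIST}" codeListValue="pointOfContact">pointOfContact</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty>
  </gmd:contact>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:citedResponsibleParty>
            <gmd:CI_ResponsibleParty>
              <gmd:individualName><gco:CharacterString>Yosemite Sam</gco:CharacterString></gmd:individualName>
              <gmd:role><gmd:CI_RoleCode codeList="{CODELIST}" codeListValue="author">author</gmd:CI_RoleCode></gmd:role>
            </gmd:CI_ResponsibleParty>
          </gmd:citedResponsibleParty>
          <gmd:citedResponsibleParty>
            <gmd:CI_ResponsibleParty>
              <gmd:organisationName><gco:CharacterString>ACME Corporation</gco:CharacterString></gmd:organisationName>
              <gmd:role><gmd:CI_RoleCode codeList="{CODELIST}" codeListValue="publisher">publisher</gmd:CI_RoleCode></gmd:role>
            </gmd:CI_ResponsibleParty>
          </gmd:citedResponsibleParty>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmi:MI_Metadata>
"""


def party_name(party):
    """Return the first non-empty individual, organisation, or position name."""
    for tag in ("individualName", "organisationName", "positionName"):
        name = party.findtext(f"gmd:{tag}/gco:CharacterString", namespaces=NS)
        if name and name.strip():
            return name.strip()
    return None


def party_role(party):
    """Return the codeListValue of the party's CI_RoleCode, if present."""
    code = party.find("gmd:role/gmd:CI_RoleCode", NS)
    return code.get("codeListValue") if code is not None else None


root = ET.fromstring(RECORD_XML)

# Specific path: the role (metadata contact) is implied by position.
metadata_contacts = [
    party_name(p) for p in root.findall("gmd:contact/gmd:CI_ResponsibleParty", NS)
]

# General path: position is ignored, so the role code supplies the role.
all_parties = [
    (party_name(p), party_role(p))
    for p in root.findall(".//gmd:CI_ResponsibleParty", NS)
]

print(metadata_contacts)
print(all_parties)
```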

Multiple CI_ResponsibleParties can be included in almost all ISO objects that can include CI_ResponsibleParties. In those cases, roleCodes can be used to associate appropriate roles with particular people if necessary. For example, the ISO CI_Citation object is used to refer to a variety of resources that are not included in a metadata record. It is modeled after a bibliographic reference and can include any number of organizations or people (CI_ResponsibleParties) in any roles. Typically a CI_Citation includes originators or authors and a publisher. 

Schema vs. Schematron

The only required element in the CI_ResponsibleParty object is the role. As in the case of the CI_OnlineResource, a CI_ResponsibleParty with only the required field(s) is not very useful. In this case, however, making any single one of individualName, organisationName, or positionName mandatory is not a reasonable solution, because a party may be identified by any of them. The solution is to constrain the object by requiring that the count of individualName + organisationName + positionName be greater than or equal to one. In other words, at least one of these three elements must exist.
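The count constraint can be sketched in Python as follows. This is only an illustration of the rule using the standard library, not a replacement for a rule-based validator; the function names and the sample citation fragment are made up for this example. The sketch counts element presence only, so an empty name element would still satisfy it.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
CODELIST = "http://www.isotc211.org/2005/resources/Codelist/gmxCodelists.xml#CI_RoleCode"
NAME_ELEMENTS = ("individualName", "organisationName", "positionName")

# Made-up fragment: the first party has a name, the second has only a role.
SAMPLE_XML = f"""
<gmd:CI_Citation xmlns:gmd="{NS['gmd']}" xmlns:gco="{NS['gco']}">
  <gmd:citedResponsibleParty>
    <gmd:CI_ResponsibleParty>
      <gmd:organisationName><gco:CharacterString>ACME Corporation</gco:CharacterString></gmd:organisationName>
      <gmd:role><gmd:CI_RoleCode codeList="{CODELIST}" codeListValue="publisher">publisher</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty>
  </gmd:citedResponsibleParty>
  <gmd:citedResponsibleParty>
    <gmd:CI_ResponsibleParty>
      <gmd:role><gmd:CI_RoleCode codeList="{CODELIST}" codeListValue="author">author</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty>
  </gmd:citedResponsibleParty>
</gmd:CI_Citation>
"""


def count_name_elements(party):
    """Count individualName + organisationName + positionName children."""
    return sum(len(party.findall(f"gmd:{tag}", NS)) for tag in NAME_ELEMENTS)


def parties_missing_names(root):
    """Return CI_ResponsibleParty elements with none of the three names."""
    parties = root.iter(f"{{{NS['gmd']}}}CI_ResponsibleParty")
    return [party for party in parties if count_name_elements(party) < 1]


root = ET.fromstring(SAMPLE_XML)
violations = parties_missing_names(root)
print(f"{len(violations)} responsible party element(s) lack a name")
```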

There are two techniques that can be used to test the validity of ISO metadata in XML. The first is to use the XML schema, which defines the structure and types of the elements and the number of times they can occur. The schema rules are expressed as the cardinality in the UML descriptions used in this wiki. The CI_ResponsibleParty constraint described above cannot be specified in an XML schema document and so cannot be tested using simple schema validation. The second technique is Schematron, a tool that can test constraints or business rules that are included in the UML. Often these rules involve multiple elements, as in the CI_ResponsibleParty case. In some cases an organization can specify several sets of Schematron rules to test conformance at different levels.