Difference between revisions of "Individual, Organization, and Role Documentation"

From Earth Science Information Partners (ESIP)

Revision as of 11:40, September 16, 2015

Overview

Documenting people is critical for many discovery, use, and understanding use cases. People may have many different roles within the data creation and management life cycle. This page provides the location (XPath) in the metadata for documenting people in a variety of roles for the DIF, ECHO, and ISO 19115-2 dialects. XML examples are also provided for each of the dialects.

Where are people Documented?

DIF

ECHO

ISO 19115

ISO 19115 and 19115-1

The ISO metadata standards support identification of individuals and organizations in many roles. Existing NASA metadata standards include mechanisms for identifying individuals and organizations in several important roles.

Metadata Authors - Identifying the authors or points of contact for the metadata content is important so that users who discover errors in the metadata know whom to contact. These metadata contacts are included in the contact for the base metadata object (gmd:MD_Metadata, gmi:MI_Metadata, mdb:MD_Metadata, or mdb:MI_Metadata).

ISO 19115 /*/gmd:contact/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode='pointOfContact']
ISO 19115-1 /*/mdb:contact/cit:CI_Responsibility[cit:role/cit:CI_RoleCode='pointOfContact']

Technical Contacts - Technical contacts are individuals or organizations that can respond to technical or scientific questions that users have about resources. These contacts should be included in both the identification and distribution sections of the metadata.

ISO 19115 /*/gmd:identificationInfo/*/gmd:pointOfContact/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode='pointOfContact']
ISO 19115-1 /*/mdb:identificationInfo/*/mri:pointOfContact/cit:CI_Responsibility[cit:role/cit:CI_RoleCode='pointOfContact']

and

ISO 19115 /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode='pointOfContact']
ISO 19115-1 /*/mdb:distributionInfo/mrd:MD_Distribution/mrd:distributor/mrd:MD_Distributor/mrd:distributorContact/cit:CI_Responsibility[cit:role/cit:CI_RoleCode='pointOfContact']

Investigators - Investigators are members of the science team that should be included in the citation for the resource.

ISO 19115 /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty/gmd:CI_ResponsibleParty[gmd:role/gmd:CI_RoleCode='principalInvestigator' or gmd:role/gmd:CI_RoleCode='originator']
ISO 19115-1 /*/mdb:identificationInfo/*/mri:citation/cit:CI_Citation/cit:citedResponsibleParty/cit:CI_Responsibility[cit:role/cit:CI_RoleCode='principalInvestigator' or cit:role/cit:CI_RoleCode='originator']
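The investigator XPath above can be approximated with Python's standard library. This is a minimal sketch, not a reference implementation: the sample record is hypothetical and deliberately incomplete (it is not schema-valid), the helper names are illustrative, and the role predicate is applied in Python because xml.etree.ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmi": "http://www.isotc211.org/2005/gmi",
}

# Hypothetical, heavily trimmed record: only the elements needed for the lookup.
SAMPLE = """<gmi:MI_Metadata xmlns:gmi="http://www.isotc211.org/2005/gmi"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo><gmd:MD_DataIdentification><gmd:citation><gmd:CI_Citation>
    <gmd:citedResponsibleParty><gmd:CI_ResponsibleParty>
      <gmd:individualName><gco:CharacterString>Example Investigator</gco:CharacterString></gmd:individualName>
      <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="principalInvestigator">principalInvestigator</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty></gmd:citedResponsibleParty>
    <gmd:citedResponsibleParty><gmd:CI_ResponsibleParty>
      <gmd:organisationName><gco:CharacterString>Example Press</gco:CharacterString></gmd:organisationName>
      <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="publisher">publisher</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty></gmd:citedResponsibleParty>
  </gmd:CI_Citation></gmd:citation></gmd:MD_DataIdentification></gmd:identificationInfo>
</gmi:MI_Metadata>"""

# Relative form of /*/gmd:identificationInfo/*/gmd:citation/... (evaluated from the root element).
INVESTIGATOR_PATH = (
    "gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/"
    "gmd:citedResponsibleParty/gmd:CI_ResponsibleParty"
)


def parties_with_roles(root, path, roles):
    """Return the CI_ResponsibleParty elements at `path` whose role code is in `roles`."""
    matches = []
    for party in root.findall(path, NS):
        code = party.find("gmd:role/gmd:CI_RoleCode", NS)
        if code is not None and (code.text or "").strip() in roles:
            matches.append(party)
    return matches


def display_name(party):
    """Return the first available individual, organisation, or position name."""
    for tag in ("individualName", "organisationName", "positionName"):
        text = party.findtext(f"gmd:{tag}/gco:CharacterString", default="", namespaces=NS)
        if text.strip():
            return text.strip()
    return ""


root = ET.fromstring(SAMPLE)
investigators = parties_with_roles(
    root, INVESTIGATOR_PATH, {"principalInvestigator", "originator"}
)
investigator_names = [display_name(p) for p in investigators]
print(investigator_names)
```

The same helper can be pointed at the metadata contact or distributor contact paths by changing the path and the set of accepted role codes.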

Conceptual Model (UML)

The simple UML for the object is shown here. It includes one required element and four optional elements each of which can occur once.  The CI_Contact includes information about physical and electronic addresses as well as a CI_OnlineResource as part of the contact information. 

ISO Citations can include any number of organizations or people (citedResponsibleParties), each with one of the following roles: resourceProvider, custodian, owner, user, distributor, originator, pointOfContact, principalInvestigator, processor, publisher, or author (see Figure). For example, the principal citation for a metadata record, in the MD_Identification section, can include an author, a publisher, and any number of principal investigators. This is very different from the FGDC approach, where the idinfo section has a citation that can include, but not differentiate roles for, many originators and a single point of contact with no clear role definition.

Roles

ISO 19115 and ISO 19115-1: resourceProvider, custodian, owner, user, distributor, originator, pointOfContact, principalInvestigator, processor, publisher, author

ISO 19115-1 only: sponsor, coAuthor, collaborator, editor, mediator, rightsHolder, contributor, funder, stakeholder


How are people Documented

DIF

ECHO

ISO 19115-2

Implementation (XML)


 The ISO dialect combines people and organizations into the CI_ResponsibleParty object, a flexible structure that supports many combinations of organizations and people. Most objects that include associated responsible parties can have any number, so, for example, a citation can have people identified in any or all of the roles listed in the CI_RoleCode code list.

The structure of the CI_ResponsibleParty is:

<gmd:CI_ResponsibleParty>
  <gmd:individualName/>
  <gmd:organisationName/>
  <gmd:positionName/>
  <gmd:contactInfo>
    <gmd:CI_Contact>
      <gmd:phone/>
      <gmd:address>
        <gmd:CI_Address>
          <gmd:deliveryPoint/>
          <gmd:city/>
          <gmd:administrativeArea/>
          <gmd:postalCode/>
          <gmd:country/>
          <gmd:electronicMailAddress/>
        </gmd:CI_Address>
      </gmd:address>
      <gmd:onlineResource/>
      <gmd:hoursOfService/>
      <gmd:contactInstructions/>
    </gmd:CI_Contact>
  </gmd:contactInfo>
  <gmd:role/>
</gmd:CI_ResponsibleParty>
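As an illustration of the structure above, the following sketch builds a small, filled-in CI_ResponsibleParty with Python's standard library. The person, organisation, and e-mail address are hypothetical, the helper names are illustrative, and the gco:CharacterString wrappers follow the usual ISO 19139 XML encoding; only a few of the optional elements are populated.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)

# Codelist catalog URL taken from the CodeLists note on this page.
CODELISTS = (
    "http://standards.iso.org/ittf/PubliclyAvailableStandards/"
    "ISO_19139_Schemas/resources/codelist/gmxCodelists.xml"
)


def gmd(tag):
    """Return the fully qualified ElementTree name for a gmd element."""
    return f"{{{GMD}}}{tag}"


def add_text(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, gmd(tag))
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text
    return wrapper


def build_party(role, individual=None, organisation=None, email=None):
    """Build a CI_ResponsibleParty, keeping the element order shown above."""
    party = ET.Element(gmd("CI_ResponsibleParty"))
    if individual:
        add_text(party, "individualName", individual)
    if organisation:
        add_text(party, "organisationName", organisation)
    if email:
        contact = ET.SubElement(ET.SubElement(party, gmd("contactInfo")), gmd("CI_Contact"))
        address = ET.SubElement(ET.SubElement(contact, gmd("address")), gmd("CI_Address"))
        add_text(address, "electronicMailAddress", email)
    # The role is the only required element, so it is always written.
    code = ET.SubElement(ET.SubElement(party, gmd("role")), gmd("CI_RoleCode"))
    code.set("codeList", f"{CODELISTS}#CI_RoleCode")
    code.set("codeListValue", role)
    code.text = role
    return party


party = build_party(
    "pointOfContact",
    individual="Example Person",            # hypothetical values
    organisation="Example Data Center",
    email="help@example.org",
)
xml_text = ET.tostring(party, encoding="unicode")
print(xml_text)
```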

Usage


_Where are ResponsibleParty objects?_  People can be connected to ISO metadata records in eight places, and in each CI_Citation. Those locations are shown in this Figure. In some cases the roles of the people are determined by where they are in the standard. In other cases, they are determined by the role code. See the Roles-by-Position vs. Roles-by-Code discussion below for more information.

Usage, description, and XPath for each location:

Citation - The ISO CI_Citation object is used to refer to a variety of resources that are not included in a metadata record. It is modeled after a bibliographic reference and can include any number of organizations or people (responsibleParties) in any roles. Typically a CI_Citation includes originators or authors and a publisher.

//gmd:CI_Citation/gmd:citedResponsibleParty

Metadata Contact - The metadataContact is a person who creates and manages metadata for resources and services. This person generally has expertise in documentation standards and has enough experience and understanding of the resource to document it in partnership with the originator or resource contact. This responsibleParty generally has role = "custodian" or "pointOfContact".

/gmi:MI_Metadata/gmd:contact

Resource Contact - The CI_ResponsibleParty in MD_Identification objects identifies the pointOfContact for the resource, defined as "identification of, and means of communication with, person(s) and organization(s) associated with the resource(s)". In many cases this person or organization is the Data Manager or the Data Center that preserves the data. These people serve as contacts when the originator of the dataset is no longer available or interested in dealing with questions about the dataset. This person has scientific expertise or experience but may not be a good source for information on data access or data order processing. This responsibleParty generally has role = "pointOfContact".

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:pointOfContact

User Contact - The CI_ResponsibleParty in MD_Usage objects identifies people who use the data. This CI_ResponsibleParty generally has role = "pointOfContact".

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_Identification/gmd:resourceSpecificUsage/gmd:MD_Usage/gmd:userContactInfo

Processor - The CI_ResponsibleParty in LE_ProcessStep objects identifies people who are responsible for processing the data. This CI_ResponsibleParty generally has role = "processor".

/gmi:MI_Metadata/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:processStep/gmd:LI_ProcessStep/gmd:processor

Resource or Metadata Maintenance Contact - The CI_ResponsibleParty in MD_MaintenanceInformation objects identifies the people who are responsible for maintaining the resource or the metadata.

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_Identification/gmd:resourceMaintenance/gmd:MD_MaintenanceInformation/gmd:contact
or
/gmi:MI_Metadata/gmd:metadataMaintenance/gmd:MD_MaintenanceInformation/gmd:contact

Distributor - The CI_ResponsibleParty in MD_Distributor objects identifies the people who manage orders and data access at a Data Center. These people have expertise in data access systems but may not be a good source for more scientific information on the resource. This CI_ResponsibleParty generally has role = "distributor".

/gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact

Extension Contact - The CI_ResponsibleParty in MD_ExtendedElementInformation objects identifies people who are responsible for creating and maintaining community-specific extensions to the standard. This CI_ResponsibleParty generally has role = "pointOfContact".

/gmi:MI_Metadata/gmd:metadataExtensionInfo/gmd:MD_MetadataExtensionInformation/gmd:extendedElementInformation/gmd:MD_ExtendedElementInformation/gmd:source

[Figure: WhereAreCI_ResponsibleParties.png, locations of CI_ResponsibleParty objects in an ISO metadata record]

 

 

Notes


CodeLists

Codelists are shared vocabularies used throughout the ISO Standards to provide a (usually small) set of choices for the value of an element. In many cases they provide a standard set of tags that can be used for classifying an object. They can be identified in the UML because their types end with "Code", e.g. CI_RoleCode.

All codeLists share codeList and codeListValue attributes that give the location of the codeList and the value from the codeList being used in a particular case. Multiple codelists can be stored in a single codeListCatalog (see http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/codelist/gmxCodelists.xml for an example), so the location usually includes a URL and an anchor for the specific codeList. The codeList values are given in the attribute and as the value of the codeList element: <ns:codeListName codeList="URL#codeListName" codeListValue="value">value</ns:codeListName>

See [CodeLists] for a list of all ISO CodeLists. 
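The codeList pattern described in this note can be sketched in Python as follows. The helper names are illustrative assumptions; the catalog URL is the one cited above, and the gmd namespace is assumed for the element.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
ET.register_namespace("gmd", GMD)

# Catalog location cited in the note above; the anchor names the specific codeList.
CATALOG = (
    "http://standards.iso.org/ittf/PubliclyAvailableStandards/"
    "ISO_19139_Schemas/resources/codelist/gmxCodelists.xml"
)


def codelist_element(name, value, catalog=CATALOG):
    """Build <gmd:NAME codeList="CATALOG#NAME" codeListValue="VALUE">VALUE</gmd:NAME>."""
    element = ET.Element(f"{{{GMD}}}{name}")
    element.set("codeList", f"{catalog}#{name}")
    element.set("codeListValue", value)
    element.text = value  # the value is repeated as the element content
    return element


def codelist_value(element):
    """Read the code, preferring the attribute and falling back to the element text."""
    return element.get("codeListValue") or (element.text or "").strip()


role = codelist_element("CI_RoleCode", "principalInvestigator")
print(ET.tostring(role, encoding="unicode"))
```

Preferring the attribute when reading is a design choice of this sketch, made because the attribute and the element text can drift apart in hand-edited records.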

Roles-by-Position vs. Roles-by-Code

People can play many different roles in the life-cycle of scientific datasets. There are two ways that those roles can be reflected in a metadata structure: by position and by code. Many people are familiar with the roles by position approach because that is the approach used in the FGDC CSDGM. The person referenced from the metadata section is the metadata contact, the person referenced from the distribution section is the distributor, and so on. Using this approach means that the object that holds information about people does not need any role indicator. That information is supplied by the position of the person in the structure.

The ISO Standards combine the roles-by-position approach with the roles-by-code approach. Roles can generally be inferred from the positions of CI_ResponsibleParty objects in the structure, but flexibility is increased by adding a code for role to each object. This is helpful when citing a dataset that involves people in multiple roles (principal investigator, publisher, author, resourceProvider) or when specifying the point of contact.

The roles-by-code approach allows the roles of the people involved with a dataset to be known even when they are accessed separately. For example, a specific xPath can be used to find the metadata contact for a resource (/gmi:MI_Metadata/gmd:contact), but a general xPath (//gmd:CI_ResponsibleParty) can be used to answer the general question "what people or organizations are associated with this dataset?" In the latter case, the role code provides information about roles even though the people are being accessed independently.
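A short sketch of the two access patterns, using Python's standard library. The sample record and its names are hypothetical and trimmed to the elements involved; the paths are written relative to the root element because ElementTree does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record with one metadata contact and one distributor contact.
SAMPLE = """<gmi:MI_Metadata xmlns:gmi="http://www.isotc211.org/2005/gmi"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:contact><gmd:CI_ResponsibleParty>
    <gmd:individualName><gco:CharacterString>Example Metadata Author</gco:CharacterString></gmd:individualName>
    <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="pointOfContact">pointOfContact</gmd:CI_RoleCode></gmd:role>
  </gmd:CI_ResponsibleParty></gmd:contact>
  <gmd:distributionInfo><gmd:MD_Distribution><gmd:distributor><gmd:MD_Distributor>
    <gmd:distributorContact><gmd:CI_ResponsibleParty>
      <gmd:organisationName><gco:CharacterString>Example Data Center</gco:CharacterString></gmd:organisationName>
      <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="distributor">distributor</gmd:CI_RoleCode></gmd:role>
    </gmd:CI_ResponsibleParty></gmd:distributorContact>
  </gmd:MD_Distributor></gmd:distributor></gmd:MD_Distribution></gmd:distributionInfo>
</gmi:MI_Metadata>"""


def describe(party):
    """Return (name, role code) for one CI_ResponsibleParty element."""
    name = ""
    for tag in ("individualName", "organisationName", "positionName"):
        name = party.findtext(f"gmd:{tag}/gco:CharacterString", default="", namespaces=NS).strip()
        if name:
            break
    role = party.findtext("gmd:role/gmd:CI_RoleCode", default="", namespaces=NS).strip()
    return name, role


root = ET.fromstring(SAMPLE)

# Roles by position: the location alone identifies the metadata contact.
metadata_contacts = [describe(p) for p in root.findall("gmd:contact/gmd:CI_ResponsibleParty", NS)]

# Roles by code: every party in the record, with the role code saying what each one does.
everyone = [describe(p) for p in root.findall(".//gmd:CI_ResponsibleParty", NS)]

print(metadata_contacts)
print(everyone)
```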

Multiple CI_ResponsibleParties can be included in almost all ISO objects that can include CI_ResponsibleParties. In those cases, roleCodes can be used to associate appropriate roles with particular people if necessary. For example, the ISO CI_Citation object is used to refer to a variety of resources that are not included in a metadata record. It is modeled after a bibliographic reference and can include any number of organizations or people (CI_ResponsibleParties) in any roles. Typically a CI_Citation includes originators or authors and a publisher. 

Schema vs. Schematron

The only required element in the CI_ResponsibleParty object is the role. As in the case of the CI_OnlineResource, a CI_ResponsibleParty with only the required field(s) is not very useful. In this case, however, no reasonable solution can be achieved by requiring individualName or organisationName or positionName. The solution is to constrain the object by requiring that the count of individualName + organisationName + positionName be greater than or equal to one. In other words, at a minimum one of these three elements must exist.

There are two techniques that can be used to test the "validity" of ISO metadata in XML. The first is to use the XML schema which defines the structure and types of the elements and the number of times they can occur. The schema rules are expressed as the cardinality in the UML descriptions used in this wiki. The CI_ResponsibleParty constraint described above cannot be specified in an XML schema document and so cannot be tested using simple schema validation. Instead, a tool called Schematron can be used to test constraints or business rules that are included in the UML. Many times these rules involve multiple elements, as in the CI_ResponsibleParty case. In some cases an organization can specify several sets of schematron rules to test conformance at different levels.
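The count constraint described above is the kind of rule a Schematron assertion would express. As a rough stand-in, the following Python sketch applies the same test to every CI_ResponsibleParty in a document; the function names and the sample fragment are hypothetical, and the wrapper element is used only to hold two parties in one parseable string.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

NAME_ELEMENTS = ("individualName", "organisationName", "positionName")

# Hypothetical fragment: the first party carries only a role, the second also names an organisation.
SAMPLE = """<wrapper xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:CI_ResponsibleParty>
    <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="author">author</gmd:CI_RoleCode></gmd:role>
  </gmd:CI_ResponsibleParty>
  <gmd:CI_ResponsibleParty>
    <gmd:organisationName><gco:CharacterString>Example Data Center</gco:CharacterString></gmd:organisationName>
    <gmd:role><gmd:CI_RoleCode codeList="#CI_RoleCode" codeListValue="distributor">distributor</gmd:CI_RoleCode></gmd:role>
  </gmd:CI_ResponsibleParty>
</wrapper>"""


def name_count(party):
    """Count individualName + organisationName + positionName children of one party."""
    return sum(len(party.findall(f"gmd:{tag}", NS)) for tag in NAME_ELEMENTS)


def check_parties(root):
    """Return one boolean per CI_ResponsibleParty: True when the count is at least one."""
    return [name_count(p) >= 1 for p in root.findall(".//gmd:CI_ResponsibleParty", NS)]


results = check_parties(ET.fromstring(SAMPLE))
print(results)
```

This sketch counts elements, as the rule is stated; it does not check whether the elements it finds actually contain text.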

 

Crosswalks

This table reflects the [MENDS Phase 3 voting] results for 5.x items pertaining to the mapping of ECHO and ISO roles.

| ISO | DIF | ECS | ECHO |
| pointOfContact (/*/gmd:contact) | DIF AUTHOR | DIF AUTHOR | TECHNICAL CONTACT |
| originator (//gmd:identificationInfo//gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty) | | Data Originator, Producer | |
| distributor (/*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact) or pointOfContact (xPath) | TECHNICAL CONTACT | User Services, Distributor Archive | Data Center Contact Distributor, DATA CENTER CONTACT, ORNL DAAC User Services, GHRC USER SERVICES, User Services, Archive/Archiver |
| principalInvestigator (/gmi:MI_Metadata/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:citedResponsibleParty) | INVESTIGATOR | Investigator, Data Originator, Producer | Investigator, INVESTIGATOR |
| custodian (/gmi:MI_Metadata/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact) | | | Data Manager |
| pointOfContact (/mdb:MD_Metadata/mdb:dataQualityInfo/mdq:DQ_DataQuality/mdq:report/*/mdq:evaluation/mdq:DQ_FullInspection/mdq:evaluationProcedure/cit:CI_Citation/cit:citedResponsibleParty) | | Quality Assessment | |
| pointOfContact (/*/mdb:acquisitionInformation/mac:MI_AcquisitionInformation/mac:instrument/mac:MI_Instrument/mac:citation/cit:CI_Citation/cit:citedResponsibleParty) | | Instrument | |

_xPath Note:_ The xPaths included in this table use several wildcards. // means any path, so //gmd:CI_ResponsibleParty indicates a gmd:CI_ResponsibleParty anywhere in an XML file. /*/ indicates a single level with several possible elements. This usually indicates one of several concrete realizations of an abstract object. For example, /*/gmd:identificationInfo could be gmd:MD_Metadata/gmd:identificationInfo or gmi:MI_Metadata/gmd:identificationInfo, and gmd:identificationInfo/*/gmd:descriptiveKeywords could be gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords or gmd:identificationInfo/srv:SV_ServiceIdentification/gmd:descriptiveKeywords.
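The effect of the single-level wildcard can be seen with two tiny, hypothetical records, one describing a dataset and one describing a service. This sketch uses Python's standard library, with paths written relative to the root element; the sample records contain only the elements needed for the comparison.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "srv": "http://www.isotc211.org/2005/srv",
}

# Hypothetical dataset record: keywords sit under gmd:MD_DataIdentification.
DATASET = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:descriptiveKeywords/>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""

# Hypothetical service record: keywords sit under srv:SV_ServiceIdentification.
SERVICE = """<gmi:MI_Metadata xmlns:gmi="http://www.isotc211.org/2005/gmi"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:srv="http://www.isotc211.org/2005/srv">
  <gmd:identificationInfo><srv:SV_ServiceIdentification>
    <gmd:descriptiveKeywords/>
  </srv:SV_ServiceIdentification></gmd:identificationInfo>
</gmi:MI_Metadata>"""

# "*" stands for exactly one level, whatever the element at that level is called.
WILDCARD_PATH = "gmd:identificationInfo/*/gmd:descriptiveKeywords"
# The fixed path names one concrete realization only.
FIXED_PATH = "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords"


def count_matches(xml_text, path):
    """Count the elements that `path` selects, starting from the root element."""
    return len(ET.fromstring(xml_text).findall(path, NS))


wildcard_counts = (count_matches(DATASET, WILDCARD_PATH), count_matches(SERVICE, WILDCARD_PATH))
fixed_counts = (count_matches(DATASET, FIXED_PATH), count_matches(SERVICE, FIXED_PATH))
print(wildcard_counts, fixed_counts)
```

The wildcard path finds the keywords in both records regardless of the root element or the identification type, while the fixed path finds them only in the dataset record.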