Difference between revisions of "Implementation Progress / Issues"

From Earth Science Information Partners (ESIP)

Revision as of 18:47, December 4, 2010

CrossRef Implementation Issues

EZID and DataCite Implementation Issues

NjH Mtg Notes from tcw Joan Starr, EZID Service Mgr at CDL:

Questions to Joan: 1. It appears from the DOI documentation (DOI Handbook v4.4 pdf) that it should be possible to create citable DOIs for an overall (collection in library terms) dataset, and for digital components or sub-resources within that overall dataset. Is that true?

Answer: Yes, it's possible to create a DOI for digital components or sub-resources within a digital collection per the DOI schema. The EZID service is intended to allow just that for data sets.

2. As it does not appear to be possible at this time to create citable DOIs for sub-resources of a collection-level item with the CrossRef implementation of DOI, what is the plan for the EZID service?

Two-part answer: the DataCite organization and the EZID service

The EZID service is designed to allow users to obtain and manage long-term, citable identifiers either for individual or batched digital resources using the DOI or the ARK identifier schemes. The service can create and resolve identifiers on behalf of the user and also allow the user to enter and maintain information about the identifier ("metadata"). Eventually, the service will also allow the deposit of the object to which the identifier refers. The service is available via both a programming interface (an API that software can use) and a web user interface. The service is relatively new and explained more fully on the EZID website at: http://www.cdlib.org/services/uc3/ezid/index.html. Info about the EZID API can be found at: http://www.cdlib.org/uc3/docs/ezidapi.html.
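As a rough sketch of what preparing identifier metadata for the API might look like, the snippet below serializes a metadata record into "name: value" lines with percent-encoding of characters that would break a line. This assumes the API accepts ANVL-style text, which these notes do not state; the function names and the element names (`target`, `title`) are placeholders for illustration, so the API documentation linked above should be checked before relying on any of it.

```python
import re


def anvl_escape(text: str) -> str:
    """Percent-encode '%', ':', CR and LF so each element stays on one line."""
    return re.sub(r"[%:\r\n]", lambda m: "%%%02X" % ord(m.group(0)), text)


def to_anvl(metadata: dict) -> str:
    """Serialize a metadata dict as newline-separated 'name: value' lines."""
    return "\n".join(
        f"{anvl_escape(name)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Placeholder record; the element names are hypothetical, not confirmed EZID names.
body = to_anvl({
    "target": "http://example.org/glacier/photo-0001",
    "title": "Example photograph\nsecond line",
})
print(body)
```

Keeping the serialization separate from any network call makes it easy to test batches of records before anything is sent to the service.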

The EZID service must be understood within the context of DataCite, an international consortium of data creators and collectors to which CDL belongs. (See http://datacite.org/) This consortium has a number of international members who are working with scientific and technical data, including national data centers and institutes such as the Australian National Data Service (ANDS), Canada Institute for Scientific and Technical Information, and the British Library. See the current list of members at http://datacite.org/members.html. This group is in the process of finalizing a set of descriptive metadata that they wish to recommend for use for datasets. I have received an early, public version of the recommended metadata set (or "kernel") that was put out for public comment in August of this year ("DataCite Metadata Kernel for the Publication and Citation of Research Data"). The list includes both required and optional elements. The first five elements are intended to be enough to create a citation for any resource that has a registered DOI, i.e., [Creator] ([PublicationDate]): [Title]. [Publisher]. [doi:DOI]. [http://dx.doi.org/DOI]. The Metadata group is now in the process of re-drafting the metadata recommendations based on the fairly extensive feedback that they received, with the expectation that the finalized list will be released by the end of this calendar year.
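To make the citation pattern concrete, here is a minimal sketch that assembles a citation string from the five elements. The function name and the sample values are illustrative only (the DOI is a placeholder, not a registered identifier), and the resolver URL is assumed to take the form http://dx.doi.org/DOI.

```python
def format_citation(creator, publication_date, title, publisher, doi):
    """Build a citation following the pattern
    Creator (PublicationDate): Title. Publisher. doi:DOI. http://dx.doi.org/DOI
    """
    return (
        f"{creator} ({publication_date}): {title}. {publisher}. "
        f"doi:{doi}. http://dx.doi.org/{doi}"
    )


# Placeholder values for illustration; none of these describe a real dataset.
print(format_citation(
    creator="Example Data Center",
    publication_date="2010",
    title="Example Glacier Photograph Collection",
    publisher="Example Publisher",
    doi="10.5555/EXAMPLE",
))
```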

Some important facts to consider in using the EZID service:

1. EZID is available via a web-based user interface, designed perhaps for an individual scientist or data creator, and via an API that can be integrated into other services and would probably facilitate batch use. The latter approach seems the one we should take for this project, especially as it could be used for both DOIs and ARKs. The practicability of the approach will need to be investigated by Yuechen, however.

2. At the moment, only University of California or DataONE partners can use the EZID service without prior negotiation with CDL. I suspect the ESIP Federation would not have too much difficulty negotiating use of the service for the testbed at least.

3. To negotiate use of EZID, we would have to establish an EZID User Group with the following responsibilities per the EZID Service Guidelines. See: http://www.cdlib.org/services/uc3/docs/EZIDServiceGuidelines.pdf

3.4 About EZID User Groups: The EZID notion of a group (or "owner group") is an aggregation of users that collectively inherits the identifiers owned by individuals in the group. EZID uses groups in three ways.

  • First, if an individual member of a group is no longer active, for whatever reason, we will work with the group administrator to assign a new owner to the member's identifiers.
  • Second, the EZID group is also the mechanism by which the EZID system controls the categories of identifiers (the “prefix”) an individual member can make using EZID.
  • Lastly, we will use the group record information for billing purposes when we implement our cost recovery program. See section 5.5, Financial Responsibilities.

4.5 Rights/Intellectual Property: The UC Curation Center and CDL make no claims of ownership about identifiers or metadata entered into EZID. Ownership of the identifiers is determined by EZID user and owner group. See section 3.4 above for more information about EZID groups.

Presumably either the ESIP Federation or one of its members could assume the role of the user group? This needs group discussion.
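For the group discussion, the three uses of an owner group quoted above (reassigning a departed member's identifiers, controlling which prefixes members may use, and holding a group record) can be pictured with a small model. This is only a sketch of the concept as described in the guidelines, not of how EZID implements it; the class, method names, users, and prefixes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerGroup:
    """Toy model of an EZID-style owner group (hypothetical, for discussion)."""
    name: str
    allowed_prefixes: set
    owners: dict = field(default_factory=dict)  # identifier -> owning user

    def can_create(self, identifier: str) -> bool:
        # The group controls which identifier prefixes its members may use.
        return any(identifier.startswith(p) for p in self.allowed_prefixes)

    def register(self, user: str, identifier: str) -> None:
        if not self.can_create(identifier):
            raise ValueError(f"prefix not permitted for group {self.name}")
        self.owners[identifier] = user

    def reassign(self, old_user: str, new_user: str) -> int:
        # When a member becomes inactive, hand their identifiers to a new owner.
        moved = 0
        for identifier, owner in list(self.owners.items()):
            if owner == old_user:
                self.owners[identifier] = new_user
                moved += 1
        return moved


# Placeholder prefixes and user names, not real assignments.
group = OwnerGroup("esip-testbed", {"doi:10.5555/", "ark:/99999/"})
group.register("alice", "doi:10.5555/GLACIER1")
group.register("alice", "ark:/99999/photo42")
print(group.reassign("alice", "bob"), "identifiers reassigned")
```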

4. The EZID service relies upon 3rd parties to manage the relationship to DataCite, the registration agency for the DOIs, i.e., The German National Library of Science and Technology (TIB) in Hannover, Germany. Currently the primary Handle Servers at TIB and Swiss Federal Institute of Technology (ETH), Zurich, store the core registration records. There is a mirror run by the Corporation for National Research Initiatives (CNRI), in Reston, VA. TIB technical staff members guarantee a minimum of 24/5 service reliability of the resolution and registration infrastructure.

5. What EZID stores / doesn't store & ESIP user group ongoing responsibilities:

  • EZID stores the identifier string and its metadata, and internally generated, administrative metadata.
  • EZID does not store passwords except in encrypted form via a one-way hash for account security purposes. All stored data for the identifiers owned by "our" group would be accessible using the API. The method for doing so is included in the API documentation.
  • The ESIP user group would have to take responsibility for maintaining the location of the digital resources, and for maintaining the metadata for the location of the resources per its permanent ID.

6. At present, there is no cost for the EZID service. In the future, however, the UC Curation Center running the service intends to charge a fee for cost recovery purposes. The business model for those costs is not yet public, but should be made public within several months (depending upon the CA state university bureaucracy).

7. In the paper announcing the draft metadata kernel, the Metadata group provided a comparison of their terms to those specified by DOI, among others. From a quick comparison with the metadata that we have for the Glacier data set, it appears that we will have all the mandatory metadata, plus other optional metadata that will allow us to create opaque identifiers, and also make use of the descriptive information that is available for each (or most) of the components of the Glacier Photo data set or "collection". The feasibility of adding the other descriptive info will need to be determined as we test use of the service / API.
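A quick way to check the Glacier records against the kernel would be a small completeness test like the sketch below. It assumes the mandatory elements are the five used in the citation pattern and names the fifth one "Identifier"; both are assumptions that should be confirmed against the finalized kernel. The function name and sample record are hypothetical.

```python
# Assumed mandatory elements (to be confirmed against the final kernel).
REQUIRED_ELEMENTS = ("Creator", "PublicationDate", "Title", "Publisher", "Identifier")


def missing_elements(record: dict) -> list:
    """Return the assumed-mandatory element names that are absent or blank."""
    missing = []
    for name in REQUIRED_ELEMENTS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Placeholder record, not actual Glacier Photo metadata.
record = {
    "Creator": "Example Photographer",
    "Title": "Example glacier photograph",
    "Publisher": "",
}
print(missing_elements(record))
```

Running such a check over every component record before registration would show early whether any items lack the metadata needed for a citable identifier.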