Difference between revisions of "DOI Landing Pages"

From Earth Science Information Partners (ESIP)

Revision as of 05:54, September 10, 2012

DOI Landing pages

[http://vso1.nascom.nasa.gov/rdap/RDAP2012_landingpages_handout.pdf Hourclé, et al., Linking Articles to Data]

Duerr, et al., On the utility of identification schemes for digital earth science data: an assessment and recommendations

Issues

Landing Pages

Should the DOI point to data (DataONE advises that) or a landing page of information about the data? If the DOI points to metadata, how do you find the data themselves? How can a computer find them?

Opacity

Should the content of the DOI itself have meaning?

Duerr et al. identifiers paper:

 Best practice is that the suffix of the identifier does not include a reference to the archive in case the
 data are moved from the original location where the persistent identifier was assigned initially.

EZID documentation:

Many of the terms in this document serve semantic opacity, which is useful in creating identifiers that age
and travel well ...  Perfect opacity in identifiers is not as important as identifiers' having semantics that
are not widely recognizable, even if those semantics may support administrative activity by specialists
(cf. ISBNs).  ...  For the purposes of longevity, however, it is critical that attachments to authority and
sub-authority names ... be avoided; it is common for political pressures to require sacrificing identifiers
created out of short-sighted administrative or branding convenience.

ESDIS:

An emerging ESDIS DOI convention is to include "DATA" in the DOI to distinguish it from a journal article
or other kind of digital object.

NCAR data citation working group discussions:

Ultimately, our decision was to use completely randomly generated DOIs for the reasons that 1) any
organizational-specific components would become out of date if data are moved between archives or
organizations and 2) any intelligence built into a DOI is another thing that needs to be managed and
maintained over the long term. But it's been a series of discussions to get to this point, and there
are some specific use cases that still argue in the other direction: human-readability, consolidation
with internal ID schemes, and easier google searching for groups of IDs. 

Frew/Bruce:

Don't encode anything in a DOI that won't be true 100 years from now.
... or is needed for any practical use now.
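
The opaque approach described in the NCAR discussion can be sketched in a few lines of Python. This is only an illustration, not any archive's actual minting procedure: the prefix `10.12345`, the function name `make_opaque_doi`, the suffix length, and the choice of alphabet are all assumptions made for the example.

```python
import secrets

# Crockford-style alphabet: digits plus uppercase letters, omitting I, L, O, U
# so that suffixes are harder to misread or to mistake for words.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Placeholder prefix for illustration only; a real prefix is assigned by a
# DOI registration agency.
EXAMPLE_PREFIX = "10.12345"


def make_opaque_doi(prefix: str = EXAMPLE_PREFIX, length: int = 8) -> str:
    """Return a DOI string whose suffix is random and carries no meaning."""
    if length < 1:
        raise ValueError("length must be at least 1")
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{prefix}/{suffix}"


if __name__ == "__main__":
    print(make_opaque_doi())
```

Nothing in the suffix names an archive, project, or data type, so nothing in it can go out of date. A real minting service would also need to check each new suffix against those already issued, which this sketch does not do.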

Content

What information should be on the DOI landing page?

Organization

How should the metadata be organized? For human consumption? For machine access?

One page vs. many pages?

Keep first page simple, and link to other pages.

Format

Consider ISO 19115, 19115-2, 19119, with 19139 XML representation for machine access.
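
As a rough illustration of what machine access to an ISO 19139 record could look like, the sketch below reads the title and abstract from a small XML fragment using Python's standard library. The `gmd` and `gco` namespace URIs are the ones ISO 19139 documents use; the sample record is invented, heavily trimmed, and not schema-valid, and the function name `extract_title_and_abstract` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 XML documents.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Invented, heavily trimmed record for illustration; a real record carries
# many more required elements.
SAMPLE_RECORD = """
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example Data Set</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract>
        <gco:CharacterString>A made-up abstract.</gco:CharacterString>
      </gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""


def extract_title_and_abstract(xml_text: str) -> dict:
    """Pull the data set title and abstract out of an ISO 19139 record."""
    root = ET.fromstring(xml_text.strip())
    ident = "gmd:identificationInfo/gmd:MD_DataIdentification"
    title = root.findtext(
        f"{ident}/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        namespaces=NS,
    )
    abstract = root.findtext(
        f"{ident}/gmd:abstract/gco:CharacterString",
        namespaces=NS,
    )
    # Either value is None when the element is missing from the record.
    return {"title": title, "abstract": abstract}


if __name__ == "__main__":
    print(extract_title_and_abstract(SAMPLE_RECORD))
```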

Examples

These examples were provided during a discussion on the ESIP-Preserve Listserv.

  1. From Helen Conover at Global Hydrology Resource Center (GHRC):
    For our DOIs, GHRC is using a basic dataset information page generated from our collection level metadata. It includes a brief description and links to data, browse and documentation.
    We have these pages for all of our datasets, though we've only got a few with DOIs, so far. Here's an example: http://ghrc.nsstc.nasa.gov/hydro/details.pl?ds=rssmif17d

  2. From Bob Cook at Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC):
    http://dx.doi.org/10.3334/ORNLDAAC/1081
    http://dx.doi.org/10.3334/ORNLDAAC/1086
    Each of our data sets have a DOI, the characteristics of which are:
    • Persistent
    • Actionable
    • Specific
      • Links to the data set, itself
    • Complete
      • Links to data and the information needed to understand and use the data
    • Machine-readable
      • not yet, but we'd like to

  3. From Mark Parsons at National Snow and Ice Data Center (NSIDC):
    http://nsidc.org/data/nsidc-0176.html
    From that landing page you can click on "view metadata record" which leads you to a variety of different metadata formats including ISO in html and XML. Here is the ISO XML link for the data set above: http://nsidc.org/cgi-bin/get_metadata.pl?id=NSIDC-0176&format=ISO&style=XML
    We don't have good provenance info in the ISO, unfortunately.
    Our DOIs point to the human readable landing page. It would be nice if the identifier knew what was asking, so it could point machines to the XML. I think that is part of the intent with "inflections" in ARKs.

  4. From John Scialdone at Socioeconomic Data and Applications Center (SEDAC): Landing page without DOI
    http://beta.sedac.ciesin.columbia.edu/data/set/epi-environmental-performance-index-pilot-trend-2012

  5. From Bruce Vollmer at Goddard Earth Sciences Data and Information Services Center (GES DISC):
    Following are landing pages for a couple of GES DISC MEaSUREs data sets that have been assigned DOIs:
    http://dx.doi.org/10.5067/MEASURES/GSSTF/DATA301
    http://dx.doi.org/10.5067/MEASURES/GSSTF/DATA311
    These pages are permanent for the DOIs assigned but will likely change in the coming weeks/months as the formatting of the information is cleaned up (e.g., clickable links) and adding other information is explored.
    Initially the landing pages are a reconstitution of GCMD DIF contents (re-use!) and contain only human readable contents that provide information on and access to the data. We are looking into adding additional useful information to the landing pages (beyond GCMD DIF contents) which potentially includes machine readable information.
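
Mark Parsons's wish in example 3, that the identifier "knew what was asking" so machines could be sent to the XML, resembles HTTP content negotiation, where the client states the formats it prefers in an Accept header. The sketch below shows how a landing-page server might pick between a human-readable page and an ISO XML record on that basis. It is an assumption-laden illustration, not a description of how any of the archives above (or ARK inflections) work: the file names, the `REPRESENTATIONS` table, and both function names are hypothetical, and partial wildcards such as `text/*` are ignored for brevity.

```python
# Hypothetical representations a landing-page server might offer.
REPRESENTATIONS = {
    "text/html": "landing-page.html",
    "application/xml": "metadata-iso19139.xml",
}
DEFAULT_TYPE = "text/html"


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client gives no weight
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(accept_header: str) -> str:
    """Return the file to serve for the given Accept header."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return REPRESENTATIONS[DEFAULT_TYPE]
    # Nothing matched (or no header was sent): fall back to the human page.
    return REPRESENTATIONS[DEFAULT_TYPE]


if __name__ == "__main__":
    print(choose_representation("application/xml"))
    print(choose_representation("text/html,application/xml;q=0.9,*/*;q=0.8"))
```

With this arrangement a browser, which lists `text/html` first, still gets the human-readable landing page, while a harvester that asks only for `application/xml` is handed the metadata record from the same URL.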