DOI Landing Pages
[http://vso1.nascom.nasa.gov/rdap/RDAP2012_landingpages_handout.pdf Hourclé, et al., Linking Articles to Data]
Digital Object Identifiers (DOIs) for EOSDIS (NASA)
Issues
Landing Pages
Should the DOI point to the data (as DataONE advises) or to a landing page of information about the data? If the DOI points to metadata, how do you find the data themselves? How can a computer find them?
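One partial answer to the "how can a computer find them" question is HTTP content negotiation: DOIs registered through DataCite or Crossref can return metadata, rather than a redirect to the landing page, when the request carries a suitable Accept header. The sketch below only builds such a request. The resolver address https://doi.org/ and the helper names are assumptions that do not come from the sources quoted on this page, and the response is metadata, not the data files, so the data location still has to be recorded in that metadata.

```python
import urllib.request

# Assumed resolver address; the examples on this page use http://dx.doi.org/.
DOI_RESOLVER = "https://doi.org/"

# Media types used for DOI content negotiation.
DATACITE_XML = "application/vnd.datacite.datacite+xml"
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi, media_type=DATACITE_XML):
    """Build (but do not send) a request that asks the resolver for
    metadata instead of the human-readable landing page."""
    return urllib.request.Request(
        DOI_RESOLVER + doi,
        headers={"Accept": media_type},
    )


def fetch_metadata(doi, media_type=DATACITE_XML):
    """Send the request and return the raw response body.
    Needs network access; hypothetical helper, not part of any DOI library."""
    with urllib.request.urlopen(build_metadata_request(doi, media_type)) as resp:
        return resp.read()


# Inspect the request for one of the ORNL DAAC DOIs without sending it.
req = build_metadata_request("10.3334/ORNLDAAC/1081")
print(req.full_url, req.get_header("Accept"))
```

A browser that sends its usual Accept header to the same URL is still redirected to the landing page, so one identifier can serve both audiences.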
Opacity
Should the content of the DOI itself have meaning?
Duerr et al. Identifiers paper:
Best practice is that the suffix of the identifier does not include a reference to the archive in case the data are moved from the original location where the persistent identifier was assigned initially.
EZID documentation:
Many of the terms in this document serve semantic opacity, which is useful in creating identifiers that age and travel well ... Perfect opacity in identifiers is not as important as identifiers' having semantics that are not widely recognizable, even if those semantics may support administrative activity by specialists (cf. ISBNs). ... For the purposes of longevity, however, it is critical that attachments to authority and sub-authority names ... be avoided; it is common for political pressures to require sacrificing identifiers created out of short-sighted administrative or branding convenience.
ESDIS:
An emerging ESDIS DOI convention is to include "DATA" in the DOI to distinguish it from a journal article or other kind of digital object.
NCAR data citation working group discussions:
Ultimately, our decision was to use completely randomly generated DOIs for the reasons that 1) any organization-specific components would become out of date if data are moved between archives or organizations and 2) any intelligence built into a DOI is another thing that needs to be managed and maintained over the long term. But it's been a series of discussions to get to this point, and there are some specific use cases that still argue in the other direction: human-readability, consolidation with internal ID schemes, and easier Google searching for groups of IDs.
Frew/Bruce:
Don't encode anything in a DOI that won't be true 100 years from now. ... or is needed for any practical use now.
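The "completely randomly generated" approach described in the NCAR discussion could look something like the sketch below. It is not NCAR's actual scheme: the alphabet, the suffix length, the placeholder prefix 10.5072, and both function names are assumptions chosen for illustration.

```python
import secrets

# Assumed alphabet: digits and consonants only, with look-alike characters
# (0/O, 1/I/L) left out, so suffixes cannot spell words or archive names.
ALPHABET = "23456789BCDFGHJKMNPQRSTVWXZ"


def mint_opaque_doi(prefix, length=8):
    """Return prefix + "/" + a random suffix that carries no meaning."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{prefix}/{suffix}"


def embeds_name(doi, names):
    """Return True if the DOI suffix contains any of the given names
    (case-insensitive), e.g. an archive or project acronym."""
    suffix = doi.split("/", 1)[1].upper()
    return any(name.upper() in suffix for name in names)


doi = mint_opaque_doi("10.5072")  # placeholder prefix, not a real assignment
print(doi, embeds_name(doi, ["ORNL", "NCAR"]))
```

The trade-off noted in the NCAR quote still applies: a suffix minted this way is harder for people to read and cannot be searched for by group.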
Content
What information should be on the DOI landing page?
Organization
How should the metadata be organized? For human consumption? For machine access?
One page vs. many pages?
Keep first page simple, and link to other pages.
Format
Consider ISO 19115, 19115-2, 19119, with 19139 XML representation for machine access.
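As a rough illustration of what "machine access" to an ISO 19139 record can mean, the sketch below pulls the title and abstract out of a record using only the Python standard library. The gmd and gco namespace URIs and element names are those of the ISO 19139 schema; the sample record is invented and heavily abbreviated, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 encodings of ISO 19115 metadata.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Invented, abbreviated record for illustration only.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example Data Set</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract>
        <gco:CharacterString>Invented abstract for illustration.</gco:CharacterString>
      </gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""


def landing_page_fields(xml_text):
    """Extract the fields a simple landing page might display.
    Missing elements come back as None."""
    root = ET.fromstring(xml_text)
    ident = "gmd:identificationInfo/gmd:MD_DataIdentification/"
    title = root.findtext(
        ident + "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        namespaces=NS,
    )
    abstract = root.findtext(ident + "gmd:abstract/gco:CharacterString", namespaces=NS)
    return {"title": title, "abstract": abstract}


print(landing_page_fields(SAMPLE))
```

The same record could therefore feed both the human-readable first page and any machine client that requests the XML directly.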
Examples
These examples were provided during a discussion on the ESIP-Preserve listserv.
- From Helen Conover at Global Hydrology Resource Center (GHRC)
For our DOIs, GHRC is using a basic dataset information page generated from our collection level metadata. It includes a brief description and links to data, browse and documentation.
We have these pages for all of our datasets, though we've only got a few with DOIs, so far. Here's an example: http://ghrc.nsstc.nasa.gov/hydro/details.pl?ds=rssmif17d
- From Bob Cook at Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC):
http://dx.doi.org/10.3334/ORNLDAAC/1081
http://dx.doi.org/10.3334/ORNLDAAC/1086
Each of our data sets has a DOI, the characteristics of which are:
- Citable Identifier / Citable Locator
  - Registered with The DOI System http://dx.doi.org/ through DataCite
- Actionable
- Specific
  - Links to the data set, itself
- Complete
  - Links to data and the information needed to understand and use the data
- Machine-readable
  - not yet, but we'd like to
- From Mark Parsons at National Snow and Ice Data Center (NSIDC)
http://nsidc.org/data/nsidc-0176.html
From that landing page you can click on "view metadata record" which leads you to a variety of different metadata formats including ISO in html and XML. Here is the ISO XML link for the data set above: http://nsidc.org/cgi-bin/get_metadata.pl?id=NSIDC-0176&format=ISO&style=XML
We don't have good provenance info in the ISO, unfortunately.
Our DOIs point to the human readable landing page. It would be nice if the identifier knew what was asking, so it could point machines to the XML. I think that is part of the intent with "inflections" in ARKs.
- From John Scialdone at Socioeconomic Data and Applications Center (SEDAC): Landing page without DOI
http://beta.sedac.ciesin.columbia.edu/data/set/epi-environmental-performance-index-pilot-trend-2012
- From Bruce Vollmer at Goddard Earth Sciences Data and Information Services Center (GES DISC)
Following are landing pages for a couple of GES DISC MEaSUREs data sets that have been assigned DOIs:
http://dx.doi.org/10.5067/MEASURES/GSSTF/DATA301
http://dx.doi.org/10.5067/MEASURES/GSSTF/DATA311
These pages are permanent for the DOIs assigned but will likely change in the coming weeks/months as the formatting of the information is cleaned up (e.g., clickable links) and the addition of other information is explored.
Initially the landing pages are a reconstitution of GCMD DIF contents (re-use!) and contain only human readable contents that provide information on and access to the data. We are looking into adding additional useful information to the landing pages (beyond GCMD DIF contents) which potentially includes machine readable information.
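Several of the examples above mention wanting one identifier to serve people and machines alike. One way a landing-page server might approach this is to inspect the HTTP Accept header and return XML only when the client prefers it over HTML. The sketch below shows that selection logic in isolation. It is ordinary HTTP content negotiation, not an implementation of ARK inflections, and the function names, media-type lists, and the "html"/"xml" return values are assumptions.

```python
XML_TYPES = ("application/xml", "text/xml")
HTML_TYPES = ("text/html", "application/xhtml+xml")


def parse_accept(header):
    """Parse an Accept header into {media_type: q_value}.
    Simplified: ignores parameters other than q."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0  # default weight when no q parameter is given
        for f in fields[1:]:
            if f.lower().startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        prefs[media] = q
    return prefs


def choose_representation(accept_header):
    """Return "xml" if the client ranks XML above HTML, else "html"."""
    prefs = parse_accept(accept_header or "")
    xml_q = max(prefs.get(t, 0.0) for t in XML_TYPES)
    html_q = max(prefs.get(t, 0.0) for t in HTML_TYPES)
    return "xml" if xml_q > html_q else "html"


# A metadata harvester asking for XML, then a typical browser header.
print(choose_representation("application/xml"))
print(choose_representation("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
```

Defaulting to HTML keeps the human-readable landing page as the answer for any client that does not state a preference.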