Summary of Results and Issues Using DOIs for Entire Data Sets

  • The schema used to validate a DOI request changes over time, so the schema originally provided no longer worked. At the time of writing, the current schema is here.
  • DOIs can be assigned in batches, but the XML for each batch must be less than 5K in size
  • The XML consists of a header and body
    • The header contains the following fields:
      • Batch ID
      • Timestamp
      • Depositor information (name and email)
      • Registrant information (name)
    • The body can contain a list of databases. For each database the following fields are used:
      • Database metadata
        • Database metadata language
        • Database title
        • Publisher (name and location)
        • Institution (name, acronym, place, department)
      • Dataset
        • Dataset type (typically collection)
        • Contributors (name, role (e.g., author), sequence (i.e., first, second, etc.))
        • Dataset title
        • Dataset description
        • Format
        • Citation_list
        • Component_list
        • Database Date (publication date, update date, or creation date)
        • DOI data (DOI value, URL); these are the only required fields
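
The header/body layout and the batch size limit described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual deposit format: the element names (doi_batch, head, doi_batch_id, and so on) are hypothetical stand-ins for whatever the current schema defines, "5K" is read here as 5 * 1024 bytes, and the DOI and URL values are placeholders. Only the fields listed above as required (DOI value and URL) are filled in for each dataset.

```python
import xml.etree.ElementTree as ET

# Assumption: the "5K" limit is interpreted as 5 * 1024 bytes.
MAX_BATCH_BYTES = 5 * 1024


def build_batch(batch_id, timestamp, depositor_name, depositor_email,
                registrant, datasets):
    """Build a batch with a header and a body.

    Element names are illustrative only; check them against the
    schema that is current when you submit.
    """
    root = ET.Element("doi_batch")

    # Header: batch ID, timestamp, depositor, registrant.
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "doi_batch_id").text = batch_id
    ET.SubElement(head, "timestamp").text = timestamp
    depositor = ET.SubElement(head, "depositor")
    ET.SubElement(depositor, "name").text = depositor_name
    ET.SubElement(depositor, "email_address").text = depositor_email
    ET.SubElement(head, "registrant").text = registrant

    # Body: one database holding one dataset per (doi, url) pair.
    body = ET.SubElement(root, "body")
    database = ET.SubElement(body, "database")
    for doi, url in datasets:
        dataset = ET.SubElement(database, "dataset")
        doi_data = ET.SubElement(dataset, "doi_data")
        ET.SubElement(doi_data, "doi").text = doi
        ET.SubElement(doi_data, "resource").text = url
    return root


def serialize_checked(root):
    """Serialize to UTF-8 bytes, refusing batches at or over the limit."""
    xml_bytes = ET.tostring(root, encoding="utf-8")
    if len(xml_bytes) >= MAX_BATCH_BYTES:
        raise ValueError(
            f"batch is {len(xml_bytes)} bytes; limit is {MAX_BATCH_BYTES}"
        )
    return xml_bytes


if __name__ == "__main__":
    batch = build_batch(
        "batch-001", "20240101000000", "Jane Doe", "jane@example.org",
        "Example Registrant",
        [("10.5555/example-1", "https://example.org/data/1")],
    )
    print(serialize_checked(batch).decode("utf-8"))
```

If a batch exceeds the limit, one option is to split the list of datasets across several batches, each with its own batch ID, and check each one separately.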

Best Practices

  • Do not include an organization name in the DOI you assign...
  • Do not assign a DOI to a data set unless you intend to permanently make at least the metadata for the data set available (perhaps with a retired status)