Interagency Data Stewardship/LifeCycle/Jan2011Meeting

From Federation of Earth Science Information Partners

ESIP 2011 Winter Meeting Plans

Tuesday, January 4, 2011

2:00-3:30 Citation guidelines and identifiers

  • Presentation on citations - Mark P.
  • Short presentations on the IDs paper and testbed - Ruth and Nancy
  • Walk-through of FOO-related examples - Curt
  • Discussion and plan development
Notes from Session
  • Encourage publishers to enforce citation requirements
    • Papers need a unique name and location (URL + URN)
    • Unique identifier (location independent): copies everywhere have the same ID; it can be created without internet access or a naming authority
    • Unique locator: the location is invariant and the item can always be found in at least one place
    • Citable locator: same as a unique locator but also accepted by publishers; reduces clutter from granule-level citation
    • Scientifically unique identifier: makes it possible to verify that contents are unchanged after a format change or rearrangement, ensuring that the data have not been tampered with
    • Different ID schemes were assessed based on technical value, user value, archive value, and existing usage in data centers.
      • UUID most promising for Unique identifier
      • Most are fine for Unique locators
      • DOI most suitable for Citable locator
      • No existing models are optimized for Scientifically unique identifier
      • Different schemes solve different problems, so plan on supporting multiple identifiers continuously as they go in and out of service. Best recommendation: a UUID and a DOI at minimum.
    • Also suggested: use UUIDs for collection identification only and relegate details to metadata.
    • Follow-up plan: work to have UUIDs for granules/files and DOIs for data sets adopted as NASA standards
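
The identifier roles discussed above can be illustrated with a minimal Python sketch. It is not the scheme assessed in the session: the DOI string is a placeholder, the helper names are hypothetical, and the content fingerprint uses a deliberately simple canonical form (sorted numeric values) as an assumption about what "unchanged after rearrangement" might mean.

```python
import hashlib
import uuid


def new_granule_id() -> str:
    """Return a random (version 4) UUID.

    It can be generated offline, with no naming authority involved.
    """
    return str(uuid.uuid4())


def content_fingerprint(values) -> str:
    """Hash the data values in a canonical order (assumed: sorted floats).

    Reordering the values or storing them in another file format leaves
    the fingerprint unchanged; changing any value alters it.
    """
    canonical = sorted(repr(float(v)) for v in values)
    return hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()


# Hypothetical collection: one DOI for the data set, one UUID per granule.
collection = {
    "doi": "10.0000/example-collection",  # placeholder, not a registered DOI
    "granules": {
        new_granule_id(): content_fingerprint([1.5, 2.0, 3.25]),
        new_granule_id(): content_fingerprint([4.0, 5.5]),
    },
}
```

In this sketch the UUID names a granule, the DOI names the citable collection, and the fingerprint plays the part of a scientifically unique identifier. A real canonical form would need to account for units, precision, and structure.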

4:00-5:30 Towards an Earth Science provenance/context content standard - Part I

Notes from Session
  • Review Earth science provenance/context requirements
    • Data streams are often broken down into archival units
    • Controlled vocabulary needed for distinguishing data types (defines format, granularity, etc.)
    • Data versioning is more complicated than software versioning: the same data from the same system but with different calibrations could have different version names
  • How to distinguish all individual granules?
    • Example: using FOO satellite data, tag each granule with a UUID, then assign a DOI to the whole collection of granules
    • Problem: corruption errors in archives result in deletion and replacement of the data, so the experiment can no longer be replicated
    • Provenance information is retained for the original data even though the data itself is deleted
    • If corrupted data is remade, it gets a new identifier; how does the reproduced experiment get cited?
    • Very messy problem: who made it? Does anyone deserve credit for reformatting it?
    • Is it possible to make it reproducible or to make it citable?
  • DOI and UUID have limitations
    • So consider a "process on demand" Dataset and an ephemeral "data transformation" web service
    • Can you look at data citations and determine if two researchers are using same data granules?
  • Begin to develop a plan for creating the standard
    • Should the federation develop citation guidelines and best practices for the use of identifiers?
    • Other organizations are already doing it; does ESIP need to as well?
    • ESIP should explicitly clarify roles and functions of identifiers for the organizations creating standards.
    • Establish principles on which the identity of data is assigned
  • Proposals for citation can be measured against these criteria
  • Be ready now to tell the scientific community how to cite data
  • Enable citation to lead to a reproducibility standard
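
The question raised in the notes, whether two data citations can show that researchers used the same granules, can be sketched as a set comparison. This is only an illustration: the citation record layout, the function names, and the identifiers are hypothetical, and it assumes each citation lists granule-level IDs in addition to a collection-level DOI.

```python
from typing import Dict, Set


def shared_granules(citation_a: Dict, citation_b: Dict) -> Set[str]:
    """Return the granule IDs that appear in both citations."""
    return set(citation_a["granules"]) & set(citation_b["granules"])


def same_data(citation_a: Dict, citation_b: Dict) -> bool:
    """True only if both citations list exactly the same granules."""
    return set(citation_a["granules"]) == set(citation_b["granules"])


# Hypothetical citations: same collection DOI, different granule subsets.
# The short granule names stand in for UUIDs.
paper_1 = {
    "doi": "10.0000/example-collection",  # placeholder, not a registered DOI
    "granules": ["granule-a", "granule-b", "granule-c"],
}
paper_2 = {
    "doi": "10.0000/example-collection",
    "granules": ["granule-b", "granule-c", "granule-d"],
}

overlap = shared_granules(paper_1, paper_2)
```

Both hypothetical papers cite the same DOI, yet they overlap on only two granules. A collection-level identifier alone cannot answer the question.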


  • Identify roles and functions of identity
  • Recommend which identifiers are appropriate
  • Guidelines on how to cite ESIP data
  • Develop guidelines with the recognition that it is an ongoing process that will continuously improve
  • Who to work with for this?
    • DataCite, among others

Wednesday, January 5, 2011

  • 1:45-3:15 Towards an Earth Science provenance/context content standard - Part II
    • Complete plan for standards development
  • 3:45-5:15 Towards an Earth Science provenance/context ontology - Part I

Thursday, January 6, 2011

  • 10:30-12:00 Towards an Earth Science provenance/context ontology - Part II
    • Refine use cases
    • Complete plan to develop ES Provenance/Context Ontology
  • 1:30-3:00 Cluster business meeting
    • Chair/co-chair election - 15 min
    • Summarize results and plans from sessions ~ 30 min
    • Moving testbed activities forward ~ 30 min