Interagency Data Stewardship/LifeCycle/Jan2011Meeting

From Earth Science Information Partners (ESIP)

ESIP 2011 Winter Meeting Plans

Tuesday, January 4, 2011

2:00-3:30 Citation guidelines and identifiers

    • Presentation on citations - Mark P.
    • Short presentations on the IDs paper and testbed - Ruth and Nancy
    • Walk-through of FOO-related examples - Curt
    • Discuss and develop a plan
Notes from Session
  • Encourage publishers to enforce citation requirements
    • Papers need a unique name and location (URL + URN)
    • Unique identifier: location independent; copies everywhere have the same ID, and it can be created without internet access or a naming authority
    • Unique locator: the location is invariant and the data can always be found in at least one place
    • Citable locator: same as a unique locator but also accepted by publishers; reduces clutter from granule-level citation
    • Scientifically unique identifier: makes it possible to verify that the contents are unchanged after a format change or rearrangement, ensuring that the data has not been tampered with
    • Different ID schemes were assessed based on technical value, user value, archive value, and existing usage in data centers.
      • UUID most promising for Unique identifier
      • Most are fine for Unique locators
      • DOI most suitable for Citable locator
      • No existing models are optimized for Scientifically unique identifier
      • Different schemes solve different problems, so plan on supporting many identifiers over time as they go in and out of service. Best recommendation: a UUID and a DOI at minimum.
    • Also suggested: use a UUID for collection identification only and relegate details to metadata.
    • Follow-up plan: work to have UUIDs for granules/files and DOIs for data sets adopted as NASA standards
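
The identifier roles in the notes above can be illustrated with a minimal Python sketch. It is not taken from the session: the function names `new_granule_id` and `content_fingerprint` are hypothetical, and the canonicalization rule (each record rendered as text, then sorted) is only one assumed way of making a fingerprint insensitive to record order and numeric formatting. The first function shows that a UUID can be minted locally with no network or naming authority; the second approximates the "scientifically unique identifier" idea with a hash over canonicalized values.

```python
import hashlib
import uuid


def new_granule_id() -> str:
    """Mint a random (version 4) UUID locally.

    No internet access or naming authority is involved, which matches the
    location-independent identifier described in the notes.
    """
    return str(uuid.uuid4())


def content_fingerprint(records) -> str:
    """Return a SHA-256 digest of the data values in a canonical form.

    Assumption for this sketch: a granule can be viewed as an iterable of
    numeric records (tuples). Each value is rendered with repr(float(...))
    and the rendered rows are sorted, so that rearranging the records or
    storing 1 instead of 1.0 does not change the digest.
    """
    canonical_rows = sorted(
        ",".join(repr(float(value)) for value in row) for row in records
    )
    text = "\n".join(canonical_rows)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = [(0.0, 1.5), (1.0, 2.5)]
    reordered = [(1, 2.5), (0, 1.5)]      # same values, different layout
    altered = [(0.0, 1.5), (1.0, 2.6)]    # one value changed

    print(new_granule_id())
    print(content_fingerprint(original) == content_fingerprint(reordered))
    print(content_fingerprint(original) == content_fingerprint(altered))
```

A fingerprint like this only detects changes that survive the chosen canonical form. Real formats would need agreed rules for precision, units, and missing values, which may be part of why the notes say no existing scheme is optimized for this role.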


4:00-5:30 Towards an Earth Science provenance/context content standard - Part I

Notes from Session
  • Review Earth science provenance/context requirements
    • Data extremes are often broken down into archival units
    • Controlled vocabulary needed for distinguishing data types (defines format, granularity, etc.)
    • Data versioning is more complicated than software versioning: the same data from the same system but with different calibrations could have different version names
  • How to distinguish all individual granules?
    • Example: Using FOO satellite data, tag each granule with a UUID, then assign a DOI to the whole collection of granules
    • Problem: corruption errors in archives result in deletion and replacement of the data, so the experiment can no longer be replicated
    • Provenance information is retained for the original data even though the data itself is deleted
    • If corrupted data is remade, it gets a new UUID. How does the reproduced experiment get cited?
    • Very messy problem: who made it? Does anyone deserve credit for reformatting it?
    • Is it possible to make it reproducible or to make it citable?
  • DOI and UUID have limitations
    • So consider a "process on demand" Dataset and an ephemeral "data transformation" web service
    • Can you look at data citations and determine if two researchers are using same data granules?
  • Begin to develop a plan for creating the standard
    • Should the federation develop citation guidelines and best practices for the use of identifiers?
    • Other organizations are already doing this; does ESIP need to as well?
    • ESIP should explicitly clarify roles and functions of identifiers for the organizations creating standards.
    • Establish principles on which the identity of data is assigned
  • Proposals for citation can be measured against these criteria
  • Be ready now to tell the scientific community how to cite data
  • Enable citation to lead to a reproducibility standard
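
The question raised in this session, whether two citations can be compared to see if researchers used the same granules, can be sketched in Python under some loud assumptions. The sketch assumes a citation records both a collection-level DOI and the set of granule UUIDs actually used. The `DataCitation` class, the `shared_granules` function, the DOI `10.9999/example-collection`, and the short granule labels are all hypothetical illustrations, not part of any ESIP or DataCite specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical citation record: one DOI for the collection,
    plus the identifiers of the individual granules that were used."""
    collection_doi: str
    granule_ids: frozenset


def shared_granules(a: DataCitation, b: DataCitation) -> frozenset:
    """Return the granule identifiers cited by both records.

    Citations of different collections are treated as sharing nothing.
    """
    if a.collection_doi != b.collection_doi:
        return frozenset()
    return a.granule_ids & b.granule_ids


if __name__ == "__main__":
    # Short labels stand in for real UUIDs to keep the example readable.
    first = DataCitation("10.9999/example-collection",
                         frozenset({"granule-a", "granule-b"}))
    second = DataCitation("10.9999/example-collection",
                          frozenset({"granule-b", "granule-c"}))
    print(sorted(shared_granules(first, second)))
```

This comparison shows the limitation discussed above: if a corrupted granule is regenerated and receives a new UUID, two researchers who used scientifically identical data would no longer appear to overlap unless the citation also carried some content-based identifier.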

CONCLUSIONS

  • Identify roles and functions of identity
  • Recommend which identifiers are appropriate
  • Provide guidelines on how to cite ESIP data
  • Develop guidelines with the recognition that this is an ongoing process that will continuously improve
  • Who to work with for this?
    • DataCite, among others

Wednesday, January 5, 2011

  • 1:45-3:15 Towards an Earth Science provenance/context content standard - Part II
    • Complete plan for standards development
  • 3:45-5:15 Towards an Earth Science provenance/context ontology - Part I

Thursday, January 6, 2011

  • 10:30-12:00 Towards an Earth Science provenance/context ontology - Part II
    • Refine use cases
    • Complete plan to develop ES Provenance/Context Ontology
  • 1:30-3:00 Cluster business meeting
    • Chair/co-chair election - 15 min
    • Summarize results and plans from sessions ~ 30 min
    • Moving testbed activities forward ~ 30 min