Interagency Data Stewardship/LifeCycle/Preservation Forum/TeleconNotes/20101013

From Earth Science Information Partners (ESIP)

Telecon Notes - October 13, 2010


Lei Pan (JPL), Al Fleig, Curt Tilmes, Mark Parsons, Jerry Pan, Bruce Barkstrom, Helen Conover, John Scialdone, Rob Raskin, Chris Lynnes, Yuechen Chi


  • Identifiers paper
  • Identifiers testbed
  • ESIP stewardship principles and best practices draft
    • Mark's update to the citation suggestion
    • Next steps
  • Provenance paper - next steps
  • AGU preparations
  • January meeting preparations

Identifiers paper update

A complete paper was sent out to the authors for review, and many have provided comments. Based on the comments, some restructuring will be done before it is submitted:

  • Much of the use case discussion will be moved ahead of the assessment sections
  • The assessment sections will be restructured to make them more uniform in presentation. Specifically, each assessment will be structured as follows:
    • Introduction and example
    • Technical Value
    • User Value
    • Archive Value
    • Use case support
  • Use case 4 (scientific identity) will be fleshed out, incorporating some of the material from the email Bruce sent out on the topic
  • Curt Tilmes' FOO examples from emails to esip-preserve will be added as supplementary materials

Identifiers testbed

Nancy Hoebelheinrich is out on vacation, but Yuechen gave a summary of his activities.

All of the glacier photo DB records have been loaded into the ESIP MySQL database, and sample DOI XML has been generated for 10 records. Once Nancy approves the XML, the remainder of the DOIs can be generated. Yuechen noted that there is a limit of 5K on the size of an XML request for DOIs, which means that several requests will need to be generated in order to cover the entire data set.
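The splitting of records across several requests can be sketched as follows. This is only an illustration, not the script used on the testbed: the function name `batch_by_size` and its parameters are hypothetical, and it assumes the limit is measured in bytes of the encoded XML (the notes do not state the unit of "5K").

```python
from typing import Iterable, List


def batch_by_size(fragments: Iterable[str], limit: int, overhead: int = 0) -> List[List[str]]:
    """Group per-record XML fragments into batches that fit a request size limit.

    limit    -- maximum request size, assumed here to be in bytes
    overhead -- bytes reserved for whatever wrapper XML surrounds each request
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = overhead

    for fragment in fragments:
        size = len(fragment.encode("utf-8"))
        # A record that cannot fit even in an empty request is an error.
        if overhead + size > limit:
            raise ValueError("a single record exceeds the request limit")
        # Start a new request when adding this record would pass the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current = []
            current_size = overhead
        current.append(fragment)
        current_size += size

    if current:
        batches.append(current)
    return batches
```

Each returned batch would then be wrapped in the request envelope and submitted separately; the greedy fill keeps the number of requests small without reordering records.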

Draft data management principles and practices

After discussion it was agreed:

  • The draft as it stands is OK and is ready for endorsement at the January meeting
  • To preserve the IPY Guidelines for posterity, Mark Parsons will move them into the wiki (IPY is over and the website will die soon)
  • Mark will also start two wiki pages for discussing citation guidelines: one for users and one for producers
  • Defining these guidelines will be an activity for the cluster over the next year. The goal would be to seek endorsement of either or both within a year

Provenance paper

Curt noted that he'd distributed a draft of the paper to the esip-preserve mailing list and has received lots of comments. He'll come back to the group after he's had a chance to digest all of them.

AGU preparations

AGU has approved the ESIP request for a workshop on writing a data management plan. It will be held Tuesday at noon. NOAA and Carol Meyer are leading this effort. The goal is not only to hold the workshop but also to release the materials online, perhaps as a video tutorial.

January meeting preparations

Ruth noted that the meeting theme is Measuring the Value of Earth Science Data and that Carol needed to know whether the cluster thought it would need a room for the entire meeting. The consensus was that yes, even if the sessions were devoted purely to provenance, the cluster could easily fill the entire meeting and then some.

Curt offered to lead a session where he walks the cluster through his examples as a way of clarifying some of the issues that need to be resolved.

We agreed that a session where projects provide a summary of what they are doing and the issues they are facing would be good. Towards that end, Helen gave a brief overview of their ACCESS project, which involves bringing the KARMA provenance system into the AMSR SIPS to build a provenance browser and provide better provenance information for AMSR standard data products. She noted that they were using OPM (the Open Provenance Model) and having to extend it, since it is too general to do a good job for Earth science data. The issues they are running up against are:

  • What should the standard way of capturing and displaying provenance be?
  • What information is needed so that users can understand what they are getting?

Bruce offered to discuss two of his papers on data value and its relationship to uncertainty (~45 min total), which demonstrate that if data uncertainty is improved, data value goes up.

Ruth noted that we need to hold a business meeting and make plans for the upcoming year, and reiterated her interest in tasks that are concrete, actionable, and doable within a year or less with the kind of resources the cluster can muster.

Ruth will start a page containing all of the suggestions to date and will talk to Curt about how best to organize the provenance-related sessions.

From the floor

Bruce will send around a paper he has been working on regarding scientific identity.