Difference between revisions of "Interagency Data Stewardship/LifeCycle/Jul2009MeetingPlans"

From Earth Science Information Partners (ESIP)
 
Latest revision as of 14:48, July 8, 2009

Please contribute your thoughts and suggestions on our plans for including Cluster activities at the upcoming summer ESIP meeting (Santa Barbara, July 2009).

Please see the current schedule. Session descriptions, goals, outcomes, and speakers follow:


  • Preservation technologies - to be given during Day 1 as a series of 1.5-hour technical workshops (perhaps a few may be split in half?)
    • The intent of this session is to determine and begin to assess preservation technologies that exist in the marketplace (both commercial and open source)
    • There would be presentations on technologies like Fedora, DSpace, DuraSpace, iRODS, NCore, LOCKSS, as well as a variety of workflow-related technologies, etc.
    • Topics each speaker should cover:
      • Purpose of the technology (which aspects of the data lifecycle the technology supports)
      • Capabilities
      • Known limitations
      • Special emphasis given to discussion of how provenance/context is handled
    • Suggested speakers
      • Fedora and DuraSpace - Thornton Staples - confirmed
      • LOCKSS - Vicky Reich - confirmed
      • iRODS - Reagan Moore - confirmed
      • iExperiment - Paolo Missier - confirmed


  • Standards - this session could be held in the midst of the Provenance and Context workshop (~2 hours)
    • Session Goals
      • Standards training
      • Raising awareness within the community of the standards that exist in the Earth sciences
      • Determine where additional standards work is needed, where agency collaboration can help move things forward, etc.
    • Presentations would be given on the following topics
      • Preservation standards
        • OAIS - Have someone knowledgeable about OAIS explain what it is and how it is being used by NOAA and other agencies. (Why is it important to use it? Is it mandatory for agencies to use? If so, who made it mandatory?)
      • Data formats - Discuss what is important in data formats to ensure long-term preservation of data – talk about HDF, HDF-EOS and NetCDF in this context. What about agencies other than NASA and NOAA? What formats do they use? How does one ensure that data stored in HDF/HDF-EOS/NetCDF continue to be readable and understandable 50 years from now? Etc.
      • Metadata formats – treat similarly to data formats, considering the metadata standards currently in use (ISO standards, North American Profile, CF-1, COARDS, PREMIS).
    • Suggested speakers:
      • OAIS (Handle this as a panel discussion following a 10-15 minute overview; total time ~30 minutes)
        • Lou Reich/John Granger (overview) - confirmed
        • Ken McDonald (NOAA usage) - confirmed
        • John Moses/Jeanne Behnke (NASA EOSDIS Data Centers' usage) - confirmed
      • Data Formats (total time ~40 minutes)
        • Mike Folk - HDF efforts to improve data preservation (15 minutes) - confirmed
        • Russ Rew - NetCDF and data preservation (15 minutes) - confirmed
        • Discussion (10 minutes) - all
      • Metadata content and format standards (total time ~50 minutes)
        • Ted Habermann - FGDC and ISO standards (15 minutes) - confirmed
        • Siri Jodha Singh Khalsa - NASA ECS Data Model (10 minutes) - confirmed
        • Rebecca Guenther - PREMIS (15 minutes) - not confirmed; she suggests Nancy Hoebelheinrich as a substitute - Nancy confirmed
        • Discussion (10 minutes) - all
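
The time allocations listed for the Standards session can be checked against the ~2 hour target with a short script. This is only a minimal sketch: the minute values are copied from the speaker list above (taking the OAIS panel at its stated total of ~30 minutes), while the names `STANDARDS_SLOTS`, `block_totals`, and `session_total` are hypothetical and introduced here purely for illustration.

```python
# Time budget for the Standards session, in minutes.
# Values are copied from the suggested-speakers list; the dictionary
# and function names are illustrative assumptions, not part of the plan.
STANDARDS_SLOTS = {
    "OAIS panel": [30],                                        # overview plus panel, ~30 total
    "Data formats": [15, 15, 10],                              # Folk, Rew, discussion
    "Metadata content and format standards": [15, 10, 15, 10], # Habermann, Khalsa, PREMIS, discussion
}


def block_totals(slots):
    """Return the total minutes allocated to each block of talks."""
    return {name: sum(minutes) for name, minutes in slots.items()}


def session_total(slots):
    """Return the total minutes allocated across all blocks."""
    return sum(block_totals(slots).values())


if __name__ == "__main__":
    for name, minutes in block_totals(STANDARDS_SLOTS).items():
        print(f"{name}: {minutes} min")
    print(f"Session total: {session_total(STANDARDS_SLOTS)} min")
```

Under these figures the three blocks come to 30, 40, and 50 minutes, which fills the 120-minute session exactly and leaves no slack for speaker changeover.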


  • Rescuing the past (~1 hour)
    • This session is about people's experiences in dealing with data from the past. What are the lessons that should be learned for the future?
    • Suggested speakers:
      • John Moses - NASA (15 min) - confirmed
      • Dennis Wingo (15 min) - confirmed
      • Tom Ross - NOAA (15 min) - confirmed, but teleconferencing capabilities still need to be worked out
      • Discussion (15 min)


  • The View from the Field (2 hours)
    • What are other disciplines and agencies doing for preservation/stewardship? How do they deal with databases, collections of files, physical objects, and ad-hoc services such as workflows? How do they deal with provenance? Any lessons to be learned and incorporated into Earth science practice?
    • Biology, Astronomy, Medicine, etc. are potential disciplines to be covered
    • Suggested speakers (12 min per speaker):
      • Bruce Wilson representing Clifford Duke - Ecological Society of America workshops on data sharing. Slides: ESA_Data_Sharing_for_ESIP_2009-07-08.ppt. ESA workshop web site: http://www.esa.org/science_resources/datasharing.php
      • Nirav Merchant - iPlant - confirmed
      • George Djorgovski - National Virtual Observatory/Caltech - confirmed
      • Steve Hughes - Planetary Data System/JPL - confirmed
      • Reagan Moore - NARA - confirmed
      • Steve Morris - LOC - confirmed
    • Presentations on workflow-related technologies (moved to this session):
      • Brian Wilson (JPL) - SciFlo - confirmed
      • Paolo Missier - iExperiment.org (will also give a 90 min presentation on Tuesday) - confirmed
  • Provenance and Context Workshop

The bulk of the time would be spent on a Provenance/Context Workshop (5 hours spread over 2 days) with the following agenda:

    • Session 1 (2 hours)
      • Introduction - Purpose of the workshop, overview of agenda, process -- Duerr (5 min)
      • Prior work
        • Summary from Winter Meeting -- Raskin (15 min)
        • Review of guiding documents: OAIS, PREMIS, USGCRP descriptions of provenance/context -- Duerr (30 min)
      • Provenance and Context research
        • Three 15-minute briefings on research projects in this area and how they fit into the overall research agenda
        • Speaker suggestions:
          • Jim Frew - confirmed
          • Bruce Barkstrom - confirmed
          • Ruth Duerr - Creation of archive information packages - confirmed
      • Open Discussion (~25 min)
        • Review what we learned in preceding sessions
        • Determine what's missing
        • Plan initial approach to creating final document
    • Session 2 (1 hour)
      • Review the prior day's plan. New ideas? Modifications?
      • Finalize outline for a report with a recommended Research Agenda and Short-term Action Plan
      • Break into small writing teams
    • Session 3 (2 hours)
      • Writing teams continue for first hour
      • Reconvene and determine plan for finalizing document


  • Cluster Meeting (Friday morning)
    • Agenda:
      • Moving forward with results from the provenance workshop
        • Unfinished business
        • Any emergent topics?
      • Unique identifiers
        • Develop plan for testbed
          • Identify data sets for test
          • Identify ID schemes to use
          • Define assessment strategy
      • Other activities/topics for the group