Draft Agenda - Data Stewardship/Life Cycle Track at the Jan 2010 Meeting

NOTE: All items are tentative unless marked as confirmed. Also, contributions to the agenda are welcome.

Tuesday, January 5, 2010

Session Time Session Details Room
1:30 - 3:00 National Perspectives
IWGDD - Chris Greer
LOC - William Lefurgy
NARA - Laurence Brewer (Canceled at the last minute due to death in the family)

Completing the series of agency presentations given over the last two ESIP Federation meetings are presentations from the Library of Congress, the National Archives and Records Administration, and the Interagency Working Group on Digital Data. Following the presentations, a panel discussion with the speakers will explore options for moving forward the ESIP Federation Vision "to be a leader in promoting the collection, stewardship, and use of Earth science data, information and knowledge."

3:00 - 3:15 Break
3:15 - 4:45 Panel (future collaborations of Federation with IWGDD/LOC/NARA) Corcoran

Wednesday, January 6, 2010

Session Time Session Details Room
2:00 - 3:30 A Proposed Short Course on Data Stewardship - Scott Hausman (NOAA/NESDIS/NCDC) Corcoran

As the volume of digitized data grows almost exponentially and the number of critical decisions enabled by automated systems expands into every sector of society, it is clear that the demand for trained data stewards has never been greater. Despite the compelling need, opportunities for training are minimal, and few academic institutions offer courses focused on data stewardship. The need is especially compelling for research scientists, who can expand their analysis opportunities and ultimately improve the quality of their results simply by following basic data stewardship principles. To help address this specific need, ESIP, NOAA/NESDIS, and the Cooperative Institute for Climate and Satellites (CICS) have agreed to co-host a 1-2 week short course on the fundamentals of data stewardship for research scientists during the summer of 2011. During this session, we will engage in a moderated discussion to explore course objectives, hands-on opportunities, potential instructors, and avenues for broader collaboration and participation.

3:30 - 4:00 Break
4:00 - 5:30 Data Systems and Stewardship Processes
4:00 - 4:30 DMAS architecture - Thomas Huang (PO.DAAC)

The Physical Oceanography Distributed Active Archive Center (PO.DAAC) at the Jet Propulsion Laboratory is the NASA data center responsible for archiving and distributing data relevant to the physical state of the ocean. With the rapid increase in data volume and the expansion of our science user community, the need to reliably capture science artifacts and promptly make data products available to our users is driving the next generation of the PO.DAAC data management system. Modern data management systems must be adaptable to diverse science data formats, scalable to meet the mission’s quality of service requirements, and able to manage the life cycle of a given science product. This talk makes three contributions to the area of modern data archive systems. First, it describes the data archive pipeline developed for PO.DAAC, the Data Management and Archive System (DMAS), which is a distributed, transaction-oriented data archive system. Second, it describes the DMAS architecture in the areas of metadata capture, state-driven product life-cycle management, load balancing, and significant event capture and reporting. Finally, it presents the application of the DMAS to the Group for High-Resolution Sea Surface Temperature (GHRSST) and Advanced SCATterometer (ASCAT) projects.

4:30 - 5:00 NSIDC's Accessioning and De-accessioning Plan - R. Duerr

Not all data are created equal. Individual data sets vary from each other in a multitude of ways: from easily measured ways such as size, format, and complexity, to ways that require a more nuanced understanding, such as the breadth and depth of a data set's potential user base (its "designated community," to use the terminology of the Reference Model for an Open Archival Information System). Given limitations in the resources available, it is not surprising that repositories, such as those of the National Snow and Ice Data Center, need to make choices about the level of support or services provided for each data set acquired, and that these choices might change over time. Such choices prioritize center activities. While such decisions have been an implicit part of NSIDC activities for many years, an effort was recently undertaken to explicitly define the Levels of Service supported at NSIDC. Baseline Levels of Service are being defined for all existing NSIDC data sets. In addition, Levels of Service considerations are a major component of the NSIDC Distributed Active Archive Center's (DAAC's) new data acquisition and decommissioning processes. In this talk we describe the Levels of Service currently supported at NSIDC and the factors that affect the effort required to obtain a given level of service. We also discuss the process users should follow if they wish to request that the NSIDC DAAC archive their data, as well as preliminary plans for decommissioning data.

5:00 - 5:30 SEDAC Long-Term Archive - Bob Downs

The development of a long-term archive for interdisciplinary scientific data can improve capabilities for preparing data for future use and for enabling future communities of users to access digital data. The establishment of the SEDAC Long-Term Archive (LTA) is described, along with efforts to develop the LTA to meet requirements specified in the Reference Model for an Open Archival Information System (OAIS). Planning for sustainable governance and management is discussed, along with criteria for data appraisal and the levels of service offered by the LTA.


Thursday, January 7, 2010

Session Time Session Details Room
8:30 - 9:30 Data Preservation/Stewardship Cluster Meeting
10:00 - 12:00 Joint session with Environmental Decision Making track Phillips/Corcoran
12:00 - 1:30 Lunch
1:30 - 3:00 ISO Metadata - Ted Habermann

The last year has seen significant uptake of the ISO Metadata Standards in the U.S. and global environmental data community. I will show examples of ISO 19115-2 for gridded coverages (including integration with grids in netCDF), granules and collections, and observing system metadata, and discuss how information about these standards is being shared on a wiki. I will also talk about expected revisions to these standards. Of course, I will also try to answer questions about ISO raised by the audience!

3:00 Adjourn