Data Management Course Outline

From Earth Science Information Partners (ESIP)

NOTE: We agreed that the initial target audience would be scientists.

Caution!

All of the modules on this site are draft materials only! They are made available here so that interested parties can see what is in development and have the chance to comment. Once modules have completed the peer and editorial review process, they will be moved to the ESIP Information Commons and placed under revision control.

Module template and Author Guidelines

For Scientists

The case for data stewardship

Data Management plans

  • Why do a data management plan? - Ruth
  • Elements of a plan - Ruth
    • Identify materials to be created - Ruth
    • Identify your audience(s) - Ruth
    • Data organization - Ruth
    • Roles and responsibilities - Ruth
    • Describing and documenting your data, including metadata - Ruth
    • Standards used - Ruth
    • Data access, sharing, and re-use policies - Ruth
    • Backups, archives, and preservation strategy - Ruth
  • Estimating effort and resources required - Ruth
    • Hardware, software capabilities required - Ruth
    • Personnel resources and skills needed - Ruth
  • Some available resources to help with developing your plan - Ruth

Local Data Management

  • Managing your data - Ruth
    • Data identifiers and locators
    • File naming conventions - Bob Cook/ORNL
    • Backing up your data - Bob Cook/ORNL
    • Write it down! Maintaining contemporaneous documentation
      • Who, what, when, where, why, how
      • Tracking and describing changes to the data
      • Lab-based approaches to Data Management - Lynn Yarmey/NSIDC
      • Documenting the 'messy' sciences - Lynn Yarmey/NSIDC
  • Data Formats - Ruth
  • Creating documentation and metadata
    • Developing a citation for your data - Bob Cook/ORNL
    • Recording provenance and context - Jeff Arnfield/NCDC
    • For your collections as a whole
    • Creating item level metadata
    • Metadata for discovery - Tyler Stevens/GCMD
    • Metadata for access and use - Jeff Arnfield/NCDC
    • Metadata for archiving - Jeff Arnfield/NCDC
    • Metadata for tracking data processing
    • Individual agencies, archives and registries may have specific requirements
    • Introduction to Standards - Lynn Yarmey/NSIDC
    • Introduction to XML - Lynn Yarmey/NSIDC
  • Working with your archive organization - Ron Weaver/NSIDC
    • Planning for longer term preservation - Jeff Arnfield/NCDC
    • Work with your archive early and often - Jeff Arnfield/NCDC
    • Broadening your user community - Bob Downs
  • Advertising your data
    • Agency/institution requirements for publishing metadata
    • Journals and publications
    • Agency/institution web sites
    • Using portals and registries
      • Publishing metadata to a Web Accessible Folder
      • Publishing metadata to GCMD - Tyler Stevens/GCMD
      • Publishing metadata to ECHO
      • Publishing metadata to Data.gov
      • NOTE: Need to address additional portals and registries beyond GCMD and ECHO. Add other entries as appropriate.
    • Datacasting
  • Providing access to your data - Bob Downs/Chris Lenhardt/Ron Weaver (whole section); Rama has volunteered to review this section
  • Additional Products
    • Additional resources: What you already have that others can use - Lynn Yarmey/NSIDC
    • Writing Sharable Code - Lynn Yarmey/NSIDC
    • Sharing vocabularies - Lynn Yarmey/NSIDC

Preservation strategies

I have added draft sections below; the references need work. - Ron Weaver

Responsible Data Use

For Data Managers

  • Data Management plan support
  • Collection or acquisition policies
  • Introduction to the OAIS reference model
  • Initial Assessment and appraisal
    • Identify information to be preserved
      • Main features and properties
      • Dependencies on information here or elsewhere
    • Identify objects to be received
    • Establish complementary information needs (e.g., format, data descriptions, provenance, reference information, context, fixity information)
      • Complementary information needed for data to be useful for climate studies (USGCRP list)
    • Assessing potential designated communities
    • Assessing probable curation duration
    • Assessing data transfer options
    • Defining access paths
    • Assessing costs and feasibility
    • Metadata, metadata standards, and levels of metadata
  • Submission agreements
    • Data integrity
    • Contacts
    • Schedule
    • Operational Procedures
    • Error reconciliation
    • Constraints
    • Other aspects necessary for understanding how to support the data
  • Preparing for ingest
  • Ingesting data
    • Validation checks
    • Identifiers
    • Citations
    • Levels of service
  • Periodic re-assessment
  • Curation activities
    • Media migration
    • Format migration