Data Management Course Outline

From Earth Science Information Partners (ESIP)

NOTE: We agreed that the initial target audience would be scientists.

For Scientists

The case for data stewardship

  • Agency requirements
    • NSF data management plan
    • NASA science data policy
    • NOAA Administrative Order 212-15, Management of Environmental and Geospatial Data and Information
  • Return on Investment
    • Return on your investment
    • Expanding the audience for your data
    • Return on public investments
  • Verifiable science
    • Tying your data to standards, metrics, and benchmarks
  • Facilitating science through interoperable discovery and access
  • Enhancing your reputation
  • Preserving the Scientific Record
    • Establishing Relationships with archives
    • Preserving a Record of Environmental Change
    • Other case studies?
  • What Not to do when Archiving Data!

Data Management plans

  • Why do a data management plan?
  • Elements of a plan
    • Identify materials to be created
    • Identify your audience(s)
    • Data organization
    • Roles and responsibilities
    • Describing and documenting your data, including metadata
    • Standards used
    • Data access, sharing, and re-use policies
    • Backups, archives, and preservation strategy
    • QUESTION: Should the plan define one or more objective metrics to make implementation and compliance measurable?
  • Estimating effort and resources required
    • Hardware, software capabilities required
    • Personnel resources and skills needed
  • Some available resources to help with developing your plan

Local Data Management

  • Managing your data
    • Data identifiers and locators - Jeff Arnfield/NCDC
    • File naming conventions (Cook)
    • Backing up your data (Cook)
    • Developing a citation for your data (Cook)
    • Recording provenance and context - Jeff Arnfield/NCDC
    • Tracking and describing changes to the data
    • QUESTIONS
      • Citation, provenance and context are also documentation/metadata activities. Should they be grouped there instead?
  • Data Formats
    • Building understandable spreadsheets - Jeff Arnfield/NCDC
    • Using self-describing data formats
    • Choosing and adopting community-accepted standards
    • Avoiding proprietary formats
  • Creating metadata
    • For your collections as a whole
    • Creating item level metadata
    • Metadata for discovery - Tyler Stevens/GCMD
    • Metadata for access and use - Jeff Arnfield/NCDC
    • Metadata for archiving - Jeff Arnfield/NCDC
    • Metadata for tracking data processing
    • Publishing metadata to GCMD - Tyler Stevens/GCMD
    • Publishing metadata to ECHO
    • QUESTIONS
      • Is "documentation" a friendlier, and more inclusive, term?
      • The "publishing" items are most closely related to advertising/accessing data -- should they be moved there?
  • Working with your archive organization
    • Broadening your user community
    • Planning for longer-term preservation - Jeff Arnfield/NCDC
  • Providing access to your data
    • Evaluating who your audience is
    • Who gets to access your data
      • Agency best practices & policies
    • Access mechanisms
    • Advertising your data (i.e., data casting)
    • Tracking data usage
    • Handling sensitive data
    • Rights
    • QUESTIONS
      • Should "advertising your data" and "providing access" be separate sections or subsections?
      • Need to address portals and registries beyond GCMD & ECHO. Some agencies have specific requirements for publishing metadata.

Preservation strategies

  • Sponsor (e.g., Agency) or institution requirements
  • Options for archiving your data
    • What archives are out there?
      • Discipline or institutional archives
      • Finding an archive
    • What to do if there is no archive out there
  • What data goes into a long-term archive?
  • What do long-term archives do with my data? - Jeff Arnfield/NCDC
  • Data transfer & submission agreements
    • See "Submission Agreements" section under "For Data Managers"
    • Agency/archive-specific requirements may vary
  • Intro to the OAIS Reference Model
  • Emerging standards for preservation
  • Metadata

Responsible Data Use

  • Citation and credit
  • Data restrictions
  • Fair use
  • Feedback and metrics
  • Collaboration
  • Community participation

For Data Managers

  • Data Management plan support
  • Collection or acquisition policies
  • Intro to the OAIS Reference Model
  • Initial Assessment and appraisal
    • Identify information to be preserved
      • Main features and properties
      • Dependencies on information here or elsewhere
    • Identify objects to be received
    • Establish complementary information needs (e.g., format, data descriptions, provenance, reference information, context, fixity information)
      • What complementary information is needed for data useful for climate studies (USGCRP list)
    • Assessing potential designated communities
    • Assessing probable curation duration
    • Assessing data transfer options
    • Defining access paths
    • Assessing costs and feasibility
    • Metadata, metadata standards, and levels of metadata
  • Submission agreements
    • Data integrity
    • Contacts
    • Schedule
    • Operational Procedures
    • Error reconciliation
    • Constraints
    • Other aspects necessary for understanding how to support the data
  • Preparing for ingest
  • Ingesting data
    • Validation checks
    • Identifiers
    • Citations
    • Levels of service
  • Periodic re-assessment
  • Curation activities
    • Media migration
    • Format migration
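
The "Validation checks" item under "Ingesting data" and the "fixity information" mentioned under "Initial Assessment and appraisal" could be illustrated with a checksum comparison. The following is a minimal sketch only, assuming the data producer supplies SHA-256 digests alongside the files. The function names and the manifest structure (a mapping of file name to expected digest) are hypothetical and do not represent any particular archive's submission format.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_against_manifest(manifest, base_dir):
    """Check each file listed in the manifest against its recorded digest.

    `manifest` maps relative file names to expected hex digests
    (a hypothetical structure chosen for illustration).
    Returns a dict of file name -> "ok", "mismatch", or "missing".
    """
    results = {}
    for name, expected in manifest.items():
        path = Path(base_dir) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

The same comparison could be repeated after a media or format migration to confirm that files not meant to change are still intact. Any file reported as "mismatch" or "missing" would be a candidate for the error reconciliation step named in the submission agreement.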