Difference between revisions of "Data Management Course Outline"

From Earth Science Information Partners (ESIP)
 
(150 intermediate revisions by 14 users not shown)
 +
[http://wiki.esipfed.org/index.php/Data_Management_Short_Course Back to the main data management training page]
 +
 
'''NOTE''':  ''We agreed that the target audience initially would be scientists''
 
'''NOTE''':  ''We agreed that the target audience initially would be scientists''
 +
 +
== Caution!!!! ==
 +
 +
All of the modules on this site are draft materials only!  They are made available here so that interested parties can see what is in development and have the chance to comment.  Once modules have completed the peer and editorial review process, they will be moved to the ESIP Information Commons and placed under revision control.
 +
 +
== Module template and Author Guidelines ==
 +
 +
* [[Media:DataShortCourseModuleTemplate.pptx | Data management training module template (ppt) (Updated - Now with Voiceover scripts & revised Refs/Resources slides)]]
 +
* [[AuthorsGuide | Author Guidelines ]]
 +
* [[Module Review Criteria]]
  
 
== For Scientists ==
 
== For Scientists ==
 
===The case for data stewardship===
 
===The case for data stewardship===
  
* Agency requirements  
+
* [[Media:AgencyRequirementsV2.ppt‎ | Agency requirements ]]- Ruth
**NSF data management plan
+
**[[Media:NSFDataManagementPlans.pptx‎ | NSF data management plans]] - Ruth
**NASA science data policy
+
**[[Media:CDSAgencyRequirementsNASADataPolicy_REV02.ppt |NASA science data policy]] - Ron Weaver
**NOAA Administrative Order 212-15, Management of Environmental and Geospatial Data and Information
+
**[[Media:DMSC_Agency_NOAA-Administrative-Order_212-15_V1.0.ppt | NOAA Administrative Order 212-15, Management of Environmental and Geospatial Data and Information]] - Jeff Arnfield/NCDC
* Return on Investment
+
* [[Media:DataShortCourseModule_ROI_1.ppt‎ | Return on Investment]] - Erin/Carol
**Return on your investment  
+
**Return on Scientist's investment - Ruth
**Expanding the audience for your data
+
**Return on the public's investment - Ruth
**Return on public investments
+
**Verifiable science - Tying your data to standards, metrics, and benchmarks
* Verifiable science
+
* Facilitating science through interoperable discovery and access - Jeff Arnfield/NCDC
**Tying your data to standards, metrics, and benchmarks
+
* [[Media:CaseforDataStewardshipEnhancingRepMayernik_update.pptx| Enhancing your reputation]] - Matt Mayernik/NCAR
* Facilitating science through interoperable discovery and access
+
* [[Media:CaseforDataStewardshipPreservingSciRecordMayernik_update.pptx‎| Preserving the Scientific Record]] - Matt
* Enhancing your reputation
+
** [[Media:PreservingScientificRecordArchiveRelationsMayernik_updated.pptx| Establishing Relationships with archives]] - Matt
* Preserving the Scientific Record
+
** [[Media:PreservingScientificRecordEnvChangeMayernik_update.pptx‎| Preserving a Record of Environmental Change]] - Matt
**Establishing Relationships with archives
+
** [[Media:PreservingScientificRecordCaseStudy1Mayernik_updated.pptx| Case Study 1 - NSIDC Glacier Photos]] - Matt
**Preserving a Record of Environmental Change  
+
** [[Media:PreservingScientificRecordCaseStudy2Mayernik_update.pptx‎| Case Study 2 - Arctic Temperature Variability Data]] - Matt
**Other case studies?
+
* What Not to do when Archiving Data!
* What Not to do when Archiving Data!  
 
  
 
===Data Management plans===
 
===Data Management plans===
  
*Why do a data management plan?
+
*[[Media:WhyDoADataManagementPlan.ppt‎x | Why do a data management plan?]] - Ruth
*Elements of a plan -  
+
*[[Media:ElementsOfDataManagementPlanV1.ppt‎x | Elements of a plan]] - Ruth
**Identify materials to be created
+
**[[Media:DMP-IdentifyingMaterialsToBeCreated.pptx | Identify materials to be created]] - Ruth
**Identify your audience(s)
+
**[[Media:DMP-OrganizationAndStandards.ppt | Organization and standards]] - Ruth
**Data organization
+
**[[Media:DMP-RolesNResponsibilities.ppt | Roles and responsibilities]] - Ruth
**Roles and responsibilities
+
**[[Media:DMP-AccessSharingReusePolicies.ppt | Data access, sharing, and re-use policies]] - Ruth
**Describing and documenting your data, including metadata
+
**[[Media:DMP-BackupsArchivingPreservation.ppt | Backups, archives, and preservation strategy]] - Ruth
**Standards used
+
*Estimating effort and resources required - Ruth
**Data access, sharing, and re-use policies
+
*Some available resources to help with developing your plan - Ruth
**Backups, archives, and preservation strategy
 
**??QUESTION: Should the plan define (an) objective metric(s) to make implementation and compliance measurable?
 
*Estimating effort and resources required
 
**Hardware, software capabilities required
 
**Personnel resources and skills needed
 
*Some available resources to help with developing your plan
 
  
 
===Local Data Management ===
 
===Local Data Management ===
  
*Managing your data
+
*Managing your data - Ruth
**Data identifiers and locators - Jeff Arnfield/NCDC
+
**Data identifiers and locators
**File naming conventions (Cook)
+
**[[Media:FileNamingModuleV1.ppt | Assign Descriptive File Names]] Bob Cook/ORNL
**Backing up your data (Cook)
+
**[[Media:DataBackupV1.ppt | Backing up your data ]] Bob Cook/ORNL
**Developing a citation for your data (Cook)
+
**Write it down! Maintaining contemporaneous documentation
**Recording provenance and context - Jeff Arnfield/NCDC
+
***Who, what, when, where, why, how
**Tracking and describing changes to the data
+
***Tracking and describing changes to the data
**'''QUESTIONS'''
+
***[[Media:ESIPmod-DataMngmtInTheLab_20120226_ly.ppt‎ | Lab-based approaches to Data Management]] - Lynn Yarmey/NSIDC
***Citation, provenance and context are also documentation/metadata activities. Should they be grouped there instead?
+
*Data Formats - Ruth
*Data Formats
+
**[[Media:DMSC_AvoidingProprietaryFormats.ppt | Avoiding proprietary formats]] - Al Fleig
**Building understandable spreadsheets - Jeff Arnfield/NCDC
+
**[[Media:ChoosingAndAdoptingCommunityAcceptedStandardsTilmes.ppt | Choosing and adopting community accepted standards]] - Curt Tilmes/NASA
**Using self-describing data formats
+
**[[Media:DMSC_DataFormats_Building-understandable-spreadsheets_V1.0.ppt | Building understandable spreadsheets]] - Jeff Arnfield/NCDC
**Choosing and adopting community accepted standards
+
**[[Media:DMSC_SelfDescribingFormats.ppt | Using self-describing data formats]] - Curt Tilmes/NASA
**Avoiding proprietary formats
+
*Creating documentation and metadata
*Creating metadata
+
**[[Media:ESIPmod-IntroToMetadataAndStandards_20121119_ly.ppt |Introduction to Metadata and Metadata Standards]] - Lynn Yarmey/NSIDC
 +
**[[Media:ESIP_Short_Course_Data_Citation_2012-02-14.ppt‎ | Creating a citation for your data]] Bob Cook/ORNL
 +
**[[Media:DMSC_Metadata_Recording-provenance-and-context_V1.0.ppt | Recording provenance and context]] - Jeff Arnfield/NCDC
 
**For your collections as a whole
 
**For your collections as a whole
 
**Creating item level metadata
 
**Creating item level metadata
**Metadata for discovery - Tyler Stevens/GCMD
+
**[[Media:GCMD Metadata Discovery Comments.pptx |Metadata for discovery]] - Tyler Stevens/GCMD
 
**Metadata for access and use - Jeff Arnfield/NCDC
 
**Metadata for access and use - Jeff Arnfield/NCDC
 
**Metadata for archiving - Jeff Arnfield/NCDC
 
**Metadata for archiving - Jeff Arnfield/NCDC
 
**Metadata for tracking data processing
 
**Metadata for tracking data processing
**Publishing metadata to GCMD - Tyler Stevens/GCMD
+
**Individual agencies, archives and registries may have specific requirements
**Publishing metadata to ECHO
+
*Working with your archive - Ron Weaver/NSIDC
**'''QUESTIONS'''
 
*** Is "documentation" a friendlier, and more inclusive, term?
 
*** The "publishing" items are most closely related to advertising/accessing data -- should they be moved there?
 
*Working with your archive organization
 
**Broadening your user community
 
 
**Planning for longer term preservation - Jeff Arnfield/NCDC
 
**Planning for longer term preservation - Jeff Arnfield/NCDC
*Providing access to your data
+
**Work with your archive early and often - Jeff Arnfield/NCDC
**Evaluating who your audience is
+
**[[Media:DataShortCourseModule-BroadeningUserCommunityV1.pptx | Broadening your user community]] - Bob Downs
 +
*[[Media:LocalDataManagementAdvertisingYourDataHoebelheinrich_final.pptx | Advertising your data]] - Nancy Hoebelheinrich/Knowledge Motifs
 +
**[[Media:AdvertisingYourDataAgencyReqsHoebelheinrich_final.pptx | Agency requirements for submitting metadata ]]- Nancy Hoebelheinrich/Knowledge Motifs
 +
**Journals and publications
 +
**Agency/institution web sites
 +
**[[Media:AdvertisingYourDataUsingPortalsRegistriesHoebelheinrich_final.pptx | Using data portals and metadata registries]]- Nancy Hoebelheinrich/Knowledge Motifs
 +
***Publishing metadata to a Web Accessible Folder
 +
***[[Media:GCMD Metadata Publish Comments.pptx |Submitting metadata to GCMD]] - Tyler Stevens/GCMD
 +
***Publishing metadata to ECHO
 +
***Publishing metadata to Data.Gov
 +
***'''NOTE:''' Need to address additional portals and registries beyond GCMD & ECHO. Add other entries as appropriate.
 +
**Casting your data - Ruth Duerr
 +
*[[Media:ProvidingAccesstoYourDataMayernik.pptx‎ | Providing access to your data]] - Matt Mayernik/NCAR ; Rama has volunteered to review this section
 +
**[[Media:DataShortCourseModule-DeterminingAudienceV1.pptx | Determining your audience]] - Bob Downs
 
**Who gets to access your data
 
**Who gets to access your data
 
***Agency best practices & policies
 
***Agency best practices & policies
**Access mechanisms
+
**[[Media:DataShortCourseModule-AccessMechanismsV1.pptx | Access mechanisms]] - Bob Downs
**Advertising your data (i.e., data casting)
+
**[[Media:DataShortCourseModule-TrackingDataUsageV1.pptx | Tracking data usage]]  - Bob Downs
**Tracking data usage
+
**[[Media:DataShortCourseModule-HandlingSensitiveDataV1.pptx | Handling Sensitive Data]] - Bob Downs
**Handling sensitive data
+
**[[Media:DataShortCourseModule-RightsV1.pptx | Rights]] - Bob Downs
**Rights
+
*Additional Products
**'''QUESTIONS'''
+
**[[Media:ESIPmod-WritingSharableCode_20120226_ly.ppt | Writing Sharable Code]] - Lynn Yarmey/NSIDC
***Should "advertising your data" and "providing access" be separate sections or subsections? 
 
***Need to address portals and registries beyond GCMD & ECHO. Some agencies have specific requirements for publishing metadata.
 
  
 
===Preservation strategies===
 
===Preservation strategies===
  
*Sponsor (e.g., Agency) or institution requirements
+
''I have added draft sections below; the references need work. - Ron Weaver''
*Options for archiving your data
+
 
**What archives are out there?
+
*[[Media:PS1_SponsorRequirements.pptx| Sponsor (e.g., Agency) or institution requirements]] - Ron Weaver /NSIDC
***Discipline or institutional archives
+
*[[Media:PS2_OptionsForArchiving.pptx | Options for archiving your data]] - Ron Weaver/NSIDC
***Finding an archive
+
**What archives are out there? - Ron Weaver/NSIDC (part of above)
**What to do if there is no archive out there
+
***Discipline or institutional archives (part of above)
*What data goes into a Long-term archive?  
+
***Finding an archive (part of above)
*What do long term archives do with my data?  - Jeff Arnfield/NCDC
+
**What to do if there is no archive out there - Ron Weaver/NSIDC (part of above)
*Data transfer & submission agreements  
+
*[[Media:PS3_WhatIsInLTA.ppt‎  | What data goes into a Long-term archive?]] - Ron Weaver/NSIDC
 +
*[[Media:DMSC_Preservation_What-do-long-term-archives-do-with-my-data.ppt | What do long term archives do with my data?]] - Jeff Arnfield/NCDC
 +
*[[Media:PS5_TransferAgreements.ppt | Data transfer & submission agreements]] - Ron Weaver/NSIDC
 
** See "Submission Agreements" section under "For Data Managers"
 
** See "Submission Agreements" section under "For Data Managers"
 
** Agency/archive-specific requirements may vary
 
** Agency/archive-specific requirements may vary
*Intro to the OAIS Reference Model
+
*[[Media:DMSC_IntroOAISRefModel.ppt | Intro to the OAIS Reference Model]] - Curt Tilmes
*Emerging standards for preservation
+
* [[Media:PS7_EmergingStandards.ppt| Emerging standards for preservation]] - Ron Weaver/NSIDC
 
*Metadata
 
*Metadata
  
 
=== Responsible Data Use ===
 
=== Responsible Data Use ===
  
*Citation and credit
+
*[[Media:ResponsibleDataUseCitationAndCreditMayernik.pptx‎ | Citation and credit]] - Matt Mayernik/NCAR
*Data restrictions
+
*[[Media:DataShortCourseModule-DataRestrictionsV1.pptx‎ | Data restrictions]] - Bob Downs
*Fair use
+
*[[Media:ResponsibleDataUseCopyrightAndDataMayernik.pptx‎ | Copyright and Data]] - Matt
*Feedback and metrics
+
*[[Media:RDU-Feedback.ppt | Providing Feedback]] - Ruth
 
*Collaboration
 
*Collaboration
 
*Community participation
 
*Community participation

Latest revision as of 10:11, January 24, 2013

NOTE: We agreed that the target audience initially would be scientists

Caution!!!!

All of the modules on this site are draft materials only! They are made available here so that interested parties can see what is in development and have the chance to comment. Once modules have completed the peer and editorial review process, they will be moved to the ESIP Information Commons and placed under revision control.

Module template and Author Guidelines

For Scientists

The case for data stewardship

Data Management plans

Local Data Management

Preservation strategies

I have added draft sections below; the references need work. - Ron Weaver

Responsible Data Use

For Data Managers

  • Data Management plan support
  • Collection or acquisition policies
  • Intro to the OAIS Reference Model
  • Initial Assessment and appraisal
    • Identify information to be preserved
      • Main features and properties
      • Dependencies on information here or elsewhere
    • Identify objects to be received
    • Establish complementary information needs (e.g., format, data descriptions, provenance, reference information, context, fixity information)
      • What complementary information is needed for data to be useful for climate studies (USGCRP list)
    • Assessing potential designated communities
    • Assessing probable curation duration
    • Assessing data transfer options
    • Defining access paths
    • Assessing costs and feasibility
    • Metadata, metadata standards, and levels of metadata
  • Submission agreements
    • Data integrity
    • Contacts
    • Schedule
    • Operational Procedures
    • Error reconciliation
    • Constraints
    • Other aspects necessary for understanding how to support the data
  • Preparing for ingest
  • Ingesting data
    • Validation checks
    • Identifiers
    • Citations
    • Levels of service
  • Periodic re-assessment
  • Curation activities
    • Media migration
    • Format migration