Preservation Use Case Obtaining Data

From Earth Science Information Partners (ESIP)

Obtaining Data

This use case covers all of the ways users obtain data. The mechanisms considered range from ad hoc methods, such as hearing about a data set from a colleague at a meeting and contacting the scientist who holds it, to using one of the major data centers' systems (or GCMD/ECHO) to find, assess, and obtain relevant data sets.

Actors

  • the researcher who needs data
  • data producers/scientists
  • data curators, who may be
    • a graduate student who takes care of the data during a project
    • data repository personnel
    • IT folks assigned to manage the storage
    • software systems and analysis tools
    • hardware systems like FTP sites
  • QA people and systems
  • Google and other search engines
  • reputable scientists or colleagues
  • data recommenders
  • citation indices for data
  • advertisements such as data casts, etc.
  • discovery tools

Sequence of Events

The general consensus was that no matter the route to actually obtaining the data, the steps involved are always the same:

  • data must be made available
  • data must be advertised
  • data must be accessible
  • there must be a feedback mechanism

PCCS artifacts

Note: Artifacts marked with an * are needed only if some issue with the data is observed.

  • the data itself
  • processing version
  • source code *
  • format
  • size of data
  • parameter descriptions
  • content descriptions
  • tools and web apps needed to read/use/transform the data
  • reputation of source
  • validation status
  • calibration method
  • processing method
  • algorithms used *
  • difference from previous versions *
  • data version
  • processing history *
  • history of what has happened to the data since it was created *
  • data inputs used *
  • precursor products used *
  • instrument schematics, etc. *
  • instrument capabilities and characteristics
  • calibration/validation data *
  • validation method *
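
The split between artifacts that are always needed and those needed only when a problem is observed can be sketched as a small checklist. This is an illustrative sketch only, not part of the use case: the names Artifact, ARTIFACTS, and artifacts_needed are hypothetical, and the only_if_issue flag is assumed to correspond to the * marker in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One entry from the artifact list (hypothetical structure)."""
    name: str
    only_if_issue: bool = False  # assumed to mirror the * marker


# Entries copied from the list above; starred items set only_if_issue=True.
ARTIFACTS = [
    Artifact("the data itself"),
    Artifact("processing version"),
    Artifact("source code", only_if_issue=True),
    Artifact("format"),
    Artifact("size of data"),
    Artifact("parameter descriptions"),
    Artifact("content descriptions"),
    Artifact("tools and web apps needed to read/use/transform the data"),
    Artifact("reputation of source"),
    Artifact("validation status"),
    Artifact("calibration method"),
    Artifact("processing method"),
    Artifact("algorithms used", only_if_issue=True),
    Artifact("difference from previous versions", only_if_issue=True),
    Artifact("data version"),
    Artifact("processing history", only_if_issue=True),
    Artifact("history of the data since it was created", only_if_issue=True),
    Artifact("data inputs used", only_if_issue=True),
    Artifact("precursor products used", only_if_issue=True),
    Artifact("instrument schematics, etc.", only_if_issue=True),
    Artifact("instrument capabilities and characteristics"),
    Artifact("calibration/validation data", only_if_issue=True),
    Artifact("validation method", only_if_issue=True),
]


def artifacts_needed(issue_observed: bool) -> list:
    """Return the names of artifacts a user needs.

    Unstarred artifacts are always returned; starred ones are added
    only when an issue with the data has been observed.
    """
    return [a.name for a in ARTIFACTS if issue_observed or not a.only_if_issue]


if __name__ == "__main__":
    print(len(artifacts_needed(issue_observed=False)), "artifacts for routine use")
    print(len(artifacts_needed(issue_observed=True)), "artifacts when an issue is observed")
```

A repository could use a checklist like this to decide which artifacts to surface by default and which to keep available on request.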