Difference between revisions of "Preservation Use Case Choosing a dataset"
Latest revision as of 23:37, December 12, 2013
Choosing a data set from multiple similar choices.
Summary
A research user needs to pick, from multiple similar data sets, the one that best meets the requirements of their intended application. For example, a polar bear ecologist might choose among the multiple sea ice data sets listed at NSIDC to characterize conditions in a region of Hudson Bay. Another user might choose which sea surface temperature data set from PO.DAAC to use in forcing a model of an algal bloom. Many other examples exist. Traditionally, the user made this choice by consulting a relevant expert. Ideally, an expert system with access to sufficient information could guide the user through the query.
Relevant experts may include:
- science domain experts that know the science applications.
- instrument experts that know the subtleties of the observation mechanism.
- algorithm experts that know the variations in retrievals.
- process experts that know the subtleties of the processing implementation.
- data format experts that know how to handle, for example, HDF4 vs. HDF5.
How best to capture the "gotchas" potentially introduced at each step along the way?
Suitability of data usage
- Mapping observations (e.g. variables) to appropriate science focus areas.
Actors
- Research user
- Data expert(s)/Expert system
- science domain experts that know the science applications.
- instrument experts that know the subtleties of the observation mechanism.
- algorithm experts that know the variations in retrievals.
- process experts that know the subtleties of the processing implementation.
- data format experts that know how to handle, for example, HDF4 vs. HDF5.
- Archive
Sequence of Events
- User poses initial request to expert
- Expert queries user on specifics
- Iteration between user and expert to understand vocabularies and actual needs
- Initial possible data sets are identified by basic criteria like whether the data set covers the right time and location
- The list is further refined by more qualitative criteria specific to the actual query
- A recommended data set or ranked list of data sets is returned to the user
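The filtering and ranking steps above can be sketched in code. This is a minimal illustration under stated assumptions, not a description of any existing archive system: the `DataSet` fields, the `quality_score` value (a stand-in for the qualitative criteria an expert would apply), and the function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataSet:
    """Hypothetical catalog record; field names are assumptions for illustration."""
    name: str
    start: date           # first date covered
    end: date             # last date covered
    bbox: tuple           # (west, south, east, north) in degrees
    quality_score: float  # assumed 0-1 stand-in for expert qualitative judgment


def covers(ds, start, end, point):
    """Basic criteria: does the data set span the time range and contain the location?"""
    lon, lat = point
    west, south, east, north = ds.bbox
    in_time = ds.start <= start and ds.end >= end
    in_space = west <= lon <= east and south <= lat <= north
    return in_time and in_space


def recommend(datasets, start, end, point):
    """Filter by basic coverage, then rank the survivors by the qualitative score."""
    candidates = [ds for ds in datasets if covers(ds, start, end, point)]
    return sorted(candidates, key=lambda ds: ds.quality_score, reverse=True)
```

Calling `recommend(catalog, date(2001, 1, 1), date(2009, 12, 31), (-85.0, 60.0))` on a list of such records would return the data sets that cover that period and point, best score first. In practice the single score would likely be replaced by criteria specific to the user's query, as the sequence above notes.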
PCCS Artifacts
- Data usage information
- Informal feedback from users (e.g., Amazon-style comments)
- Publications about the data
- Publications that use the data
- Data “peer review” information. This is ill-defined, but could include:
- Audit information about practices and processes used to produce and maintain the data
- Advice from scientific advisory groups, etc.
- ….
- Authority or certification information
- Who is the authority (if there is one) that is asserting that the data meet certain quality criteria (e.g., National Weather Service)
- Criteria used in the certification
Notes
- Relevant experts may include:
- science domain experts that know the science applications.
- instrument experts that know the subtleties of the observation mechanism.
- algorithm experts that know the variations in retrievals.
- process experts that know the subtleties of the processing implementation.
- data format experts that know how to handle, for example, HDF4 vs. HDF5.
- How best to capture the "gotchas" potentially introduced at each step along the way?
- Suitability of data usage
- Mapping observations (e.g. variables) to appropriate science focus areas.