Difference between revisions of "Preservation Use Case Choosing a dataset"
Latest revision as of 23:37, December 12, 2013
Choosing a data set from multiple similar choices.
Summary
A research user needs to pick, from multiple similar data sets, the one that best meets the requirements of their intended application. An example could be a polar bear ecologist choosing a data set on sea ice conditions in a region of Hudson Bay from the multiple data sets listed at NSIDC. Another example could be a user choosing which sea surface temperature data set from PO.DAAC to use in forcing a model of an algal bloom. Many other examples exist. Traditionally, the user would consult a relevant expert. Ideally, one could conceive of an expert system guiding the user through their query, if the system had access to sufficient information.
Relevant experts may include:
- science domain experts that know the science applications.
- instrument experts that know the subtleties of the observation mechanism.
- algorithm experts that know the variations in retrievals.
- process experts that know the subtleties of the processing implementation.
- data format experts that know how to handle, for example, HDF4 versus HDF5.
How best to capture the "gotchas" potentially introduced at each step along the way?
Suitability of data usage
- Mapping observations (e.g. variables) to appropriate science focus areas.
Actors
- Research user
- Data expert(s)/Expert system
  - science domain experts that know the science applications.
  - instrument experts that know the subtleties of the observation mechanism.
  - algorithm experts that know the variations in retrievals.
  - process experts that know the subtleties of the processing implementation.
  - data format experts that know how to handle, for example, HDF4 versus HDF5.
- Archive
Sequence of Events
- User poses initial request to expert
- Expert queries user on specifics
- Iteration between user and expert to understand vocabularies and actual needs
- Initial possible data sets are identified by basic criteria like whether the data set covers the right time and location
- The list is further refined by more qualitative criteria specific to the actual query
- A recommended data set or ranked list of data sets is returned to the user
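The last three steps above (basic screening, qualitative refinement, ranked result) can be sketched in code. This is only an illustration of how an expert system might do it, not a description of any existing archive service. The `Dataset` and `Request` classes, the criterion names, the weights, and the example data sets are all hypothetical, and the bounding-box test assumes regions that do not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date

# (west, south, east, north) in degrees; assumed not to cross the antimeridian.
BBox = tuple[float, float, float, float]


@dataclass
class Dataset:
    name: str
    start: date
    end: date
    bbox: BBox
    # Hypothetical qualitative scores in [0, 1], e.g. derived from expert input.
    scores: dict[str, float]


@dataclass
class Request:
    start: date
    end: date
    bbox: BBox
    # Hypothetical weights expressing what matters for this application.
    weights: dict[str, float]


def covers(ds: Dataset, req: Request) -> bool:
    """Basic criteria: the data set spans the requested time and location."""
    w, s, e, n = ds.bbox
    req_w, req_s, req_e, req_n = req.bbox
    in_time = ds.start <= req.start and ds.end >= req.end
    in_space = w <= req_w and s <= req_s and e >= req_e and n >= req_n
    return in_time and in_space


def rank(datasets: list[Dataset], req: Request) -> list[Dataset]:
    """Screen by coverage, then order by a weighted sum of qualitative scores."""
    candidates = [ds for ds in datasets if covers(ds, req)]

    def score(ds: Dataset) -> float:
        return sum(wt * ds.scores.get(name, 0.0) for name, wt in req.weights.items())

    return sorted(candidates, key=score, reverse=True)


# Made-up example data sets and a made-up request for a Hudson Bay region.
arctic = (-180.0, 50.0, 180.0, 90.0)
datasets = [
    Dataset("ice_a", date(1979, 1, 1), date(2012, 12, 31), arctic,
            {"resolution": 0.4, "user_feedback": 0.9}),
    Dataset("ice_b", date(2002, 1, 1), date(2012, 12, 31), arctic,
            {"resolution": 0.9, "user_feedback": 0.5}),
    Dataset("ice_c", date(1979, 1, 1), date(2012, 12, 31), (-180.0, -90.0, 180.0, -50.0),
            {"resolution": 0.9, "user_feedback": 0.9}),
]
request = Request(date(2005, 1, 1), date(2010, 12, 31), (-95.0, 51.0, -76.0, 64.0),
                  {"resolution": 0.7, "user_feedback": 0.3})

ranked_names = [ds.name for ds in rank(datasets, request)]
print(ranked_names)
```

In this sketch the weights stand in for the outcome of the earlier iteration between user and expert. A real system would also need the vocabularies and artifacts described elsewhere on this page to produce the scores.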
PCCS Artifacts
- Data usage information
  - Informal feedback from users (e.g., Amazon-style comments)
  - Publications about the data
  - Publications that use the data
- Data “peer review” information. This is ill-defined, but could include:
  - Audit information about the practices and processes used to produce and maintain the data
  - Advice from scientific advisory groups, etc.
  - ….
- Authority or certification information
  - Who is the authority (if there is one) asserting that the data meet certain quality criteria (e.g., National Weather Service)?
  - Criteria used in the certification
Notes
- Relevant experts may include:
  - science domain experts that know the science applications.
  - instrument experts that know the subtleties of the observation mechanism.
  - algorithm experts that know the variations in retrievals.
  - process experts that know the subtleties of the processing implementation.
  - data format experts that know how to handle, for example, HDF4 versus HDF5.
- How best to capture the "gotchas" potentially introduced at each step along the way?
- Suitability of data usage
  - Mapping observations (e.g., variables) to appropriate science focus areas.