Difference between revisions of "P&S Data Quality"
(18 intermediate revisions by 3 users not shown)
Latest revision as of 15:30, May 23, 2006
NOTE: This page does not yet incorporate a second discussion held in April. There will be a six-hour workshop at the 2006 Summer Meeting (see discussion May 23).
Objective
Create a common set of data quality metrics across all Federation data products. Data providers can supply measures for their own products, and 3rd parties can supply their own ratings. Quality can refer to accuracy, completeness, and consistency, although it is not yet clear how to measure consistency. It is also desirable to provide quality assurance.
We would like to create a 1-10 data quality scale, where:
1 = no accuracy claimed
10 = fully reliable data that has withstood the test of time
This measure can be applied to any of the quality dimensions:
Quality Dimensions
- Sensor/Instrument (well calibrated, stable, checked across instruments, V/V)
- Spacecraft (locational and communication accuracy)
- Environment Issues (contamination from clouds, rainfall, ground, sea, dirt, etc.)
- Data Processing (accuracy of interpolation, algorithms, ancillary source data)
Our Task
Create a 1-10 scale for each dimension. We will work with Federation members to associate a quality description with each value.
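As a rough illustration of what a per-dimension rating might look like in practice, the following sketch records a 1-10 score for each of the four quality dimensions listed above. The class name, field names, and dimension identifiers are hypothetical and chosen only for this example; the Federation has not defined any such structure.

```python
from dataclasses import dataclass, field

# Dimension identifiers derived from the Quality Dimensions list;
# the exact spellings are an assumption made for this sketch.
DIMENSIONS = ("sensor_instrument", "spacecraft", "environment", "data_processing")

SCALE_MIN, SCALE_MAX = 1, 10  # 1 = no accuracy claimed, 10 = fully reliable


@dataclass
class QualityRating:
    """Hypothetical record of 1-10 scores for one data product."""
    product: str
    rated_by: str  # e.g. the data provider or a 3rd party
    scores: dict = field(default_factory=dict)

    def set_score(self, dimension: str, value: int) -> None:
        """Store a score after checking the dimension and the 1-10 range."""
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if not isinstance(value, int) or not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"score must be an integer from {SCALE_MIN} to {SCALE_MAX}")
        self.scores[dimension] = value

    def unrated(self) -> list:
        """Return the dimensions that have not been scored yet."""
        return [d for d in DIMENSIONS if d not in self.scores]
```

Keeping the rater separate from the scores would let a provider's self-assessment and a 3rd party rating coexist for the same product.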
Other topics
- Quality assurance (someone tags it as valid)
  - Useful metadata provided?
- Instrument Verification and Validation
- Data processing
  - Re-processing tag and notification
  - Input errors and forcings
  - Re-gridding
  - Missing data
- Usage issues
  - High enough resolution?
  - Valid inference about what is measured
  - Chain of Custody (for legal use)
Completeness: Can we come up with categories of data completeness?
3rd party ratings
- NCDC
  - NCDC Certified data (only states that it is in the archive; designates it as official, not a quality statement)
  - Dataset docs use FGDC quality section, with different levels of detail
- GCMD
  - DIF records have some minimum required fields to accept
  - Then have a text field to describe quality
- ECHO
  - "Measured parameters" from ECS model
  - QA percent cloud cover; missing pixels
- CLASS/Climate Data Record
  - Maturity Model approach for data (John Bates application from software maturity)
  - Level of maturity (five levels of improved treatment)
  - See CDR Maturity paper
- FGDC
  - Whole section on quality, text only
- Testimonials
- Peer review
Discussion
Completeness
- Is this a measure of quality?
  - It depends on the provider's stated offering: if they claim the data is complete and it isn't, that counts against quality.
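One way to make the completeness question concrete is to compare what a provider claims with what is actually present. The sketch below is only an illustration under simple assumptions: missing values are represented by `None`, completeness is the fraction of non-missing entries, and both function names are hypothetical.

```python
def completeness_fraction(values):
    """Fraction of entries that are present (not None); 0.0 for empty input."""
    values = list(values)
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def claim_holds(values, claimed_fraction):
    """True if the observed completeness meets the provider's stated claim."""
    return completeness_fraction(values) >= claimed_fraction


# Example: one of four samples is missing.
samples = [1.0, None, 3.0, 4.0]
print(completeness_fraction(samples))  # 0.75
print(claim_holds(samples, 1.0))       # False: claimed complete, but is not
```

Under this view, a dataset that is openly offered as 75% complete would pass the check, while the same dataset advertised as complete would fail it.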
Assertions about datasets
We may want a standard for making claims about datasets and for measuring how valid a given claim is.
Additional Questions
- What common data quality standards can the Federation offer within the Earth Information Exchange?
- How can we enforce these standards within the Earth Information Exchange?
- Are there similar ratings for "data services"?