Data Maturity Matrix
Purpose
This wiki page collects resources and information relevant to evaluating and improving the Scientific Data Stewardship Maturity Matrix.
Meeting Notes
- 2015-04-06: Meeting Notes
- 2015-03-02: Meeting Notes
Publication
- Peng, G., Privette, J. L., Kearns, E. J., Ritchey, N. A., & Ansari, S. (2015). A unified framework for measuring stewardship practices applied to digital environmental datasets. Data Science Journal, 13, 231-253. doi:10.2481/dsj.14-049
- Peng, G., Ritchey, N. A., Casey, K. S., Kearns, E. J., Privette, J. L., Saunders, D., Jones, P., Maycock, T., & Ansari, S. (2016). Scientific stewardship in the open data and big data era — Roles and responsibilities of stewards and other major product stakeholders. D-Lib Magazine, 22(5/6). doi:10.1045/may2016-peng
Resources
- High-Level Background on the Data Stewardship Maturity Matrix
- Data Stewardship Maturity Matrix Template
- PDF Version of the Data Stewardship Maturity Matrix
- A review of 6 preservation maturity models by J. Bailey
Use Case Examples
1) Dataset Title: NCAR Global Climate Four-Dimensional Data Assimilation (CFDDA) Hourly 40 km Reanalysis
- Dataset Description: The NCAR Global Climate Four-Dimensional Data Assimilation (CFDDA) Hourly 40 km Reanalysis is a dynamically downscaled dataset created using NCAR's CFDDA system. The dataset contains three-dimensional hourly analyses in netCDF format for the global atmospheric state from 1985 to 2005 on a 40 km horizontal grid (0.4 degree grid increment) with 28 vertical levels.
- Dataset Landing Page: http://rda.ucar.edu/datasets/ds604.0/#!description
- Dataset Identifier: http://dx.doi.org/10.5065/D6M32STK
- Dataset's Data Maturity Matrix Evaluation Result: Media:NCAR_CFDDA_DataMaturityMatrix_v01r04_20150730.pdf
- Note: The evaluation was also performed to provide feedback on Rev.1 v3.1 (02/26/2015) of the Data Maturity Matrix template.
2) Dataset Title: SBC LTER: pH time series: Water-sample pH and CO2 system chemistry, ongoing since 2011
- Dataset Description: Data are a time series of pH and carbonate chemistry in manually collected seawater samples from nine near-shore locations along the Santa Barbara Channel, California, USA, and are intended to benchmark data from moored pH instruments. Data collection began in June 2011 and is ongoing.
- Dataset Landing Page: http://sbc.lternet.edu/cgi-bin/showDataset.cgi?docid=knb-lter-sbc.75
- Dataset Identifier: http://dx.doi.org/10.6073/pasta/e06713c1ee0e48b41162767b4610b451
- Dataset's Data Maturity Matrix Evaluation Result: Media:SBC75_DataMaturityMatrix_Evaluation_Rev1.pdf
- Note: The evaluation was also performed to provide feedback on Rev.1 v3.1 (02/26/2015) of the Data Maturity Matrix template.
Q&As
Please submit additional questions via the ESIP-Preserve listserv.
* What are those entities at the top of the matrix (pdf version)?
Those are the entities under which “non-functional” requirements on scientific data stewardship are asserted. The terms "functional" and "non-functional" requirements are often used in systems engineering to define, in a broad sense, what a system is supposed to do and what it is supposed to be. The term “non-functional requirements” is used here to refer to constraints imposed by U.S. federal regulations and agency policies on the stewardship of environmental data.
* Does this maturity refer to that of an organization?
No. This maturity assessment model measures stewardship practices applied to individual digital Earth Science datasets, leveraging community best practices and standards. However, it could be used as one of the factors in determining the maturity of an archive or repository.
* How does this stewardship maturity assessment model differ from the previous preservation maturity assessment models?
It distinguishes itself from most of the existing preservation maturity models in the following aspects:
- It is dataset-oriented as opposed to process-oriented, providing a unified framework to assess the robustness of quantifiable stewardship practices that are applied to individual Earth Science datasets.
- It stresses data quality, along with scientific oversight of data and metadata quality and usability, which are critical to climate and environmental data products and to their users and stakeholders.