Agclimate telecon 2018-02-06

From Earth Science Information Partners (ESIP)
Revision as of 16:29, January 31, 2018 by Wteng


Agenda

The International Soil Carbon Network’s (http://iscn.fluxdata.org/) Soil Organic Carbon Data Recovery and Harmonization Repository (SOC-DRaHR) is an open-source project that identifies and harmonizes data sets of interest to soil carbon research. The project harnesses community interest in developing harmonized data sets by allowing data providers and meta-analysis authors to contribute scripts that translate data sets from their archived format to a common harmonized format. The main repository is at https://github.com/ktoddbrown/soils-long-tail-recovery, and the supporting R package is at https://github.com/ktoddbrown/soilDataR.

Past efforts to create a centralized soil database have focused on developing generalized templates that data providers could populate. These templates are inevitably both too large and too small: they include fields for measurements that a given study did not collect, yet lack fields for measurements that it did take. While dynamic template generation can address some of these concerns, manual transcription errors and the cost in personnel time remain.

The ISCN and others (notably the Environmental Data Initiative) have recently taken an alternative approach: customized scripts read the archived data into a common internal format, and further scripts then generate the desired data product. When these scripts are linked to digitally archived data sets, the provenance is clear, which enables reproducible science. Writing these scripts requires a deep understanding of the data set but only modest computational skills. SOC-DRaHR seeks to lower the computational barrier even further by providing tools and training that enable soil-trained scientists to become data scripters. In the future, SOC-DRaHR will enable data rescue by defining and developing best practices and free tools to collect data from older studies whose data may be preserved only in graphs and other formats that are not machine-readable.
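The script-based approach described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the SOC-DRaHR or soilDataR interface (the project's own tooling is in R). The archived table, its column names, the `COLUMN_KEY` mapping, the `harmonize` function, and the layout of the common records are all hypothetical, chosen only to show one way a per-data-set script might translate an archived file into a shared long format while recording where each value came from.

```python
import csv
import io

# Hypothetical archived file: the column names, units, and values are
# invented for illustration and do not come from any real data set.
ARCHIVED_CSV = """site,depth_top_cm,depth_bot_cm,SOC_pct
A1,0,10,2.5
A1,10,30,1.2
B7,0,15,3.1
"""

# Hypothetical per-data-set key: maps each archived column name to a
# (common variable name, unit) pair. Each data set would get its own key.
COLUMN_KEY = {
    "depth_top_cm": ("layer_top", "cm"),
    "depth_bot_cm": ("layer_bottom", "cm"),
    "SOC_pct": ("soc", "percent"),
}


def harmonize(text, column_key, id_column, source):
    """Translate a wide archived table into long-format common records."""
    records = []
    for index, row in enumerate(csv.DictReader(io.StringIO(text))):
        for column, (variable, unit) in column_key.items():
            records.append({
                "source": source,      # provenance: which archive the value came from
                "row": index,          # keeps values from the same archived row together
                "site": row[id_column],
                "variable": variable,
                "value": float(row[column]),
                "unit": unit,
            })
    return records


records = harmonize(ARCHIVED_CSV, COLUMN_KEY, id_column="site", source="example-archive")
for record in records[:3]:
    print(record)
```

Keeping the column key as plain data, separate from the translation function, is one possible design choice: the person who knows the data set only has to describe its columns, and the same function can be reused for other data sets.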

Soil data are increasingly being archived online, yet archiving alone is not enough to make the data reusable. Harmonizing these data through an open community project like SOC-DRaHR is critical to drawing new insights from old data.



Telecon Notes
January 2018 Telecon Notes

Attendees:
Bill Teng
Nancy Hoebelheinrich