EnviroSensing Monthly telecons

From Federation of Earth Science Information Partners
Revision as of 13:20, December 2, 2014 by Griesc

Telecons are held on the fourth Tuesday of every month at 4:00 PM ET. Click on 'join'.

Next telecon: December 23, 2014, 4:00 PM ET. Canceled due to Christmas.

Notes from past telecons


Jordan Read presented the SensorQC R package

The recorded session may be downloaded here: download


Wade Sheldon presented the GCE Data Toolbox – a short summary follows:

  • Community-oriented environmental software package
  • Lightweight, portable, file-based data management system implemented in MATLAB
  • Generalized technical analysis framework, useful for automated processing; a good compromise that supports either programmed or file-based operations
  • Generalized tabular data model
  • Metadata, data, robust API, GUI library, support files, MATLAB databases
  • Benefits and costs: platform independent, shares both code and data seamlessly across systems, independent of the MATLAB version, and now "free and open source" software. There is a growing community of users in LTER.

Toolbox data model

  • The data model is meant to be a self-describing environmental data set: metadata is associated with the data, and the creation date, edit date, and lineage are maintained.
  • Quality control criteria: can apply a custom function or one already in the toolbox
  • Data arrays with corresponding arrays of qualifier flags; similar to a relational database table but with more associated metadata
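
The data model described above can be illustrated with a minimal Python sketch. This is not the GCE Data Toolbox's actual MATLAB structure; the class names `Column` and `DataSet`, their fields, and the sample values are all hypothetical and only show the idea of values paired with flag arrays, attached metadata, and a maintained lineage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Column:
    """One variable: values plus a parallel array of qualifier flags."""
    name: str
    units: str
    values: list
    flags: list = field(default_factory=list)

    def __post_init__(self):
        # Keep one flag slot per value; an empty string means "no flag".
        if not self.flags:
            self.flags = [""] * len(self.values)
        if len(self.flags) != len(self.values):
            raise ValueError("flags must align one-to-one with values")


@dataclass
class DataSet:
    """Hypothetical self-describing data set: metadata, columns, lineage."""
    title: str
    columns: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    edited: datetime = None
    lineage: list = field(default_factory=list)

    def __post_init__(self):
        if self.edited is None:
            self.edited = self.created

    def log(self, step):
        # Record a processing step and refresh the edit date.
        self.edited = datetime.now(timezone.utc)
        self.lineage.append((self.edited.isoformat(), step))

    def add_column(self, column):
        self.columns[column.name] = column
        self.log(f"added column {column.name} ({column.units})")

    def apply_qc(self, name, flag, criterion):
        # criterion is any callable returning True for a suspect value,
        # standing in for a custom or built-in QC function.
        col = self.columns[name]
        for i, value in enumerate(col.values):
            if criterion(value):
                col.flags[i] = flag
        self.log(f"applied QC flag '{flag}' to {name}")


# Made-up example values, for illustration only.
ds = DataSet(title="Example met station", metadata={"site": "hypothetical"})
ds.add_column(Column("air_temp", "degC", [12.1, 11.8, 99.9, 12.4]))
ds.apply_qc("air_temp", "Q", lambda v: v > 50)
print(ds.columns["air_temp"].flags)
print([step for _, step in ds.lineage])
```

Keeping the flags in an array parallel to the values means a value is never overwritten when it is flagged, and the lineage list records each step alongside the data it affected.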

Toolbox function library

  • The software library is referred to as a "toolbox"
  • A growing set of analytical functions, transformations, and aggregation tools
  • GUI functions to simplify usage
  • Indexing and search support tools, and data harvest management tools
  • Command line API, but there is also a large and growing set of graphical form interfaces, so you can start the toolbox without using the command line

Data management framework

  • Data management cycle - designed to help an LTER site do all of its data management tasks
  • Data and metadata can be imported into the framework, and a very mature set of predefined import filters exists: CSV, space- and tab-delimited, and generic parsers. Specialized parsers are also available for Sea-Bird CTD, sondes, Campbell, Hobo, Schlumberger, OSIL, etc.
  • Live connections, e.g., Data Turbine, ClimDB, SQL databases, and access to the MATLAB data toolbox
  • Can import data from NWIS, NOAA, NCDC, etc.
  • Can set evaluation rules, conditions, evaluations, etc.
  • Automated QC runs on import, but interactive analysis and revision are also possible
  • All steps are automatically documented, so you can generate an anomalies report by variable and date range which lets you communicate more to the users of the data
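
The rule-based QC and anomalies report described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the toolbox's rule syntax or report format; the `RULES` table, the function names, the flag codes, and the sample records are all made up for the example.

```python
from datetime import date

# Hypothetical rules: (variable, flag, description, predicate on value).
RULES = [
    ("air_temp", "I", "value below -40 degC", lambda v: v < -40),
    ("air_temp", "I", "value above 50 degC", lambda v: v > 50),
    ("rel_hum", "Q", "value above 100 %", lambda v: v > 100),
]


def apply_rules(records, rules):
    """Flag records on import; return (flagged_records, anomaly_log)."""
    flagged, log = [], []
    for rec in records:
        flags = {}
        for variable, flag, description, predicate in rules:
            value = rec.get(variable)
            if value is not None and predicate(value):
                flags[variable] = flag
                # Document every flagging event so it can be reported later.
                log.append({"date": rec["date"], "variable": variable,
                            "flag": flag, "reason": description})
        flagged.append({**rec, "flags": flags})
    return flagged, log


def anomalies_report(log, variable, start, end):
    """Summarize logged anomalies for one variable within [start, end]."""
    return [f"{e['date'].isoformat()} {e['variable']} [{e['flag']}]: {e['reason']}"
            for e in log
            if e["variable"] == variable and start <= e["date"] <= end]


# Made-up records, for illustration only.
records = [
    {"date": date(2014, 7, 1), "air_temp": 21.5, "rel_hum": 64.0},
    {"date": date(2014, 7, 2), "air_temp": 73.0, "rel_hum": 61.0},
    {"date": date(2014, 7, 3), "air_temp": 20.9, "rel_hum": 104.0},
]
flagged, log = apply_rules(records, RULES)
report = anomalies_report(log, "air_temp", date(2014, 7, 1), date(2014, 7, 31))
print("\n".join(report))
```

Because each flag is logged with its reason at the moment it is assigned, the report can be filtered by variable and date range without re-running the rules.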

The recorded session may be downloaded here: download


  • Fox Peterson (Andrews LTER) reported on QA/QC methods they are applying to historic climate records (~13 million data points for each of 6 sites).

The challenge was that most automated approaches still flagged too many values that then needed to be checked manually. Multiple statistical methods were tested based on long-term historical data. The method they selected uses a moving window of data from the same hour over 30 days and tests against 4 standard deviations in that window. For example, use all data for 1 pm on days 30-60 of the year, compute four standard deviations, and set the range for the midpoint day (45) at the 1 pm hour to that range.
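
A minimal Python sketch of the moving-window test described above follows. It is not the Andrews LTER code. It assumes the range is centered on the window mean (the notes only say "four standard deviations"), uses a half-width of 15 days to match the days 30-60 example, and ignores wrap-around at year boundaries. The function names and the synthetic data are hypothetical.

```python
from statistics import mean, stdev


def window_range(records, day, hour, half_width=15, n_sd=4.0):
    """Acceptable (low, high) range for one day-of-year and hour.

    records: iterable of (day_of_year, hour, value) tuples.
    Pools values from the same hour within +/- half_width days of `day`.
    Assumption: the range is centered on the window mean.
    """
    window = [v for d, h, v in records
              if h == hour and abs(d - day) <= half_width]
    if len(window) < 2:
        return None  # not enough data to estimate a spread
    center, spread = mean(window), stdev(window)
    return center - n_sd * spread, center + n_sd * spread


def flag_outliers(records, half_width=15, n_sd=4.0):
    """Return records whose value falls outside its own window range."""
    flagged = []
    for d, h, v in records:
        limits = window_range(records, d, h, half_width, n_sd)
        if limits is not None and not (limits[0] <= v <= limits[1]):
            flagged.append((d, h, v))
    return flagged


# Synthetic 1 pm readings for days 1-90, with one spike on day 45.
records = [(d, 13, 10.0 if d % 2 == 0 else 11.0) for d in range(1, 91)]
records[44] = (45, 13, 60.0)

low, high = window_range(records, 45, 13)
print(f"day 45, 1 pm range: {low:.2f} to {high:.2f}")
print(flag_outliers(records))
```

One design point worth noting: the value being tested is part of its own window here, so an extreme value inflates the standard deviation it is tested against. A single value can lie at most (n-1)/sqrt(n) sample standard deviations from the mean of n values, so a 4 standard deviation test of this form cannot flag anything in a window with fewer than 18 points. Pooling the same hour across many years of record, or excluding the tested value from the window, would avoid that limit.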

  • Josh Cole reported on his system, which is in development; he will be able to share scripts with the group.
  • Brief discussion of displaying results using web tools.
  • The Great Basin site discussed the variability in their data, which "has no normal": how could we perform QA/QC based on statistics and ranges in this case?
  • Discussion of inviting Wade Sheldon to the next call and of the usefulness of the toolbox for data managers
  • Discussion of using the pandas package: does anyone have experience with it, and can we get them on a call?
  • Discussion of the trade-off between large data stores, computational strength, and power. Are there good solutions?
  • ESIP email had some student opportunities which may be of interest
  • Overall, it was considered helpful if people were willing to share scripts. Discussion of a Git repository for the group, or possibly just using the wiki.

The recorded session may be downloaded here: download


Suggestions for future discussion topics

  • Citizen Science contributions to environmental monitoring
  • 'open' sensors - non-commercial sensors made in-house, technology, use, best practices
  • Latest sensor technologies
  • Efficient data processing approaches
  • Online data visualizations
  • New collaborations to develop new algorithms for better data processing
  • Sensor system management tools (communicating field events and associating them with data)
