Sensor Data Quality

Contacts

The primary editors for this page may be contacted for questions, comments, or help with content additions.

Don Henshaw – U.S. Forest Service Research, Pacific Northwest Research Station – don.henshaw at oregonstate.edu
Mary Martin – Hubbard Brook LTER, University of New Hampshire – mary.martin at unh.edu

Overview

A new generation of environmental sensors, together with recent major advances in the acquisition and real-time transmission of continuously monitored environmental data, presents a major challenge for quality assurance (QA) and quality control (QC) of high-throughput data streams. Deployments of sensor networks are becoming increasingly common at environmental research locations, and there is a growing need to access these large volumes of data in near real-time. However, the direct release of streaming sensor data raises the likelihood that incorrect or misleading data will be made available. Additionally, as research applications begin to rely on real-time data streams, the continual and consistent delivery of this information will be essential. This increasing access to and use of environmental sensor data demands the development of strategies to assure data quality, the immediate application of quality control methods, and a description of any QA/QC procedures applied to the data.

Traditional QC systems tend to operate on file-based collections of environmental data from field sheets, field recorders or computers, or downloaded datalogger files. Manually applied tools and techniques such as graphical comparisons are used to provide data validation. Documentation is typically not well-organized and not directly associated with data values. The application of these systems must balance the need for release without months or years of delay versus the delivery of well-documented, high quality data. However, with increasing deployment of sensor networks, these older systems fail to scale or keep pace with user needs associated with high volumes of streaming data. Comprehensive and responsive QC systems are needed that are designed to reduce potential problems and can more quickly produce high quality data and metadata. Methods described here for building a QC system will include identification of:

preventative measures to be taken in the field
quality checks that can be performed in near real-time
necessary data management practices

Introduction

A team approach is necessary to build a QC system and multiple skills and personnel are needed. The QC system will begin with system design and preventative measures taken in the field and continue through data quality checking and data publishing. A lead scientist will propose research questions and describe the types of data and necessary quality. Expertise in field logistics, sensor systems and wireless communications will play a role in site design and construction. A sensor system expert will provide knowledge of specific sensors and programming skills to establish quality control checking. Field technicians with strong knowledge of the overall scientific goals and communication skills can help to articulate issues and discover solutions. A data manager will be needed to guide delivery and archival of documented data products. Communication among all parties is necessary for the most timely delivery of well-documented and high quality data.

All team members will be needed to define a QC workflow that is useful in describing procedures and personnel responsibilities as the data flows from field sensors to published data streams. A QC system must allow for an iterative, quality management cycle to accommodate feedback to policies, procedures, and system design as data collections continue over time. A system will depend on communication among team members to assure that noted sensor data collection and transport issues and problems are addressed quickly and documented in the data stream. An active, well-documented QC system will help to establish user-confidence in data products.

Automated or semi-automated QC systems are needed that can adequately review and screen source data and still provide for its timely release. Automated quality control processes such as range checking can be performed in near real-time, and a system can assign data qualifier codes, or flags, to any sensor value when problems or uncertainty occur in the data stream. However, these processes can often only indicate potential problems in the data stream that still require manual review. A comprehensive QC system is only achievable as a hybrid system that combines automated QC checks with manual intervention to assure the highest data quality.

For this chapter we will define quality assurance (QA) as those preventative processes or steps taken to reduce problems and inaccuracies in the streaming data. These will include sensor network design, protocol development for routine maintenance and sensor calibration, and best practice procedures for field activities and data management. Quality control (QC) primarily refers to the tests provided to check data quality and the assignment of data flags and other notations to qualify issues and describe problems. QC system refers to this complete set of QA/QC preventative and product-oriented processes.

Methods

Sensor Quality Assurance (QA)

Quality assurance (QA) refers to preventative measures and activities used to minimize inaccuracies in the data. For example, scheduling regular site visits and maintenance procedures, or continuously monitoring and evaluating site sensor behavior can prevent sensor failures or lead to early detection of problems. Designing networks with redundant sensor measurements provides an additional means to quality check sensor data and assure continuity of measurement. Of course, the time and expense to conduct high-level maintenance procedures or implement efficient and redundant designs may be limited by project budgets, but may be warranted by the importance of the data. Here we describe QA measures categorized by design, maintenance, and practices:

Design

Design for replicate sensors. Co-located sensors independent of the datalogger and included in the data flow can be useful checks. For example, check temperature measurements might be made alongside a Campbell thermistor with a HOBO pendant, SDI-12 temperature sensor, or analog thermocouple. Ideally, three replicate sensors are used so that sensor drift can be detected (with two sensors it may not be obvious which sensor is drifting).
Assure an adequate power supply. Power considerations might include adding a low voltage disconnect (LVD) to prevent logger “brown-out”, or adding power accessories with a switched power supply (e.g., CSI logger, IP relay) to programmatically control optional devices (e.g., radios) or to power-cycle loggers.
Protect all instrumentation and wiring from UV light, animals, human disturbance, etc., for example with flex conduit or enclosures.
Implement an automated alert system to warn about potential sensor network issues or certain events, e.g., extreme storms. For example, automated alerts might signal low battery power, indicate sensor calibration is needed, or indicate high winds or precipitation.
Add on-site cameras or webcams. Webcams can be used to record weather or site conditions, animal disturbance or human access.
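
The replicate-sensor design above can be sketched in a few lines of Python. This is a minimal illustration, assuming three co-located readings at one timestamp; the function name, the sensor labels, and the 0.5 degree tolerance are hypothetical choices and not part of any system described here.

```python
from statistics import median

def find_suspect_sensors(readings, tolerance):
    """Return names of sensors whose reading differs from the median
    of all replicates by more than `tolerance`.

    `readings` maps a sensor name to its value at one timestamp.
    With three replicates, two sensors that agree keep the median
    close to their values, so a single drifting sensor usually
    stands out.
    """
    center = median(readings.values())
    return sorted(name for name, value in readings.items()
                  if abs(value - center) > tolerance)

# Hypothetical air temperature readings (degrees C) at one timestamp.
sample = {"thermistor": 12.1, "pendant": 12.3, "thermocouple": 14.0}
print(find_suspect_sensors(sample, tolerance=0.5))  # ['thermocouple']
```

With only two replicates the same comparison would show that the sensors disagree, but not which one is at fault.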

Maintenance

Schedule routine sensor maintenance. Routine site visits following standard protocols can assure proper maintenance activities.
Standardize field notebooks, check sheets or field computer applications to lead field technicians through a standard set of procedures and assure that all necessary tasks are conducted. These notebooks or applications can serve as an entry point for technical observations regarding potential problems or sensor failures.
Schedule routine calibration of instruments and sensors based on manufacturer specifications. Maintaining additional calibrated sensors of the same make/model can allow immediate replacement of sensors removed for calibration to avoid data loss. Otherwise, sensor calibrations can be scheduled at non-critical times or staggered such that a nearby sensor can be used as a proxy to fill gaps.
Anticipate common repairs and maintain an inventory of replacement parts. Sensors can be replaced before failure where sensor lifetimes are known or can be estimated.
Assure proper installation of sensors (correct orientation, clean wiring, solid connections and mounting, etc.). Protocols for installing new sensors will also assure that key information is logged regarding a sensor’s establishment (See Management section).

Practices

Maintain an appropriate level of human inspection. Develop the capability to easily view real-time data and examine regularly (daily/weekly). Regular inspection can help identify sensor problems quickly and might allow for fewer site visitations. Certain problems such as visible extreme spikes, intermittent values, or repetitive values can be easily viewed in raw data plots.
Spot-check measurements with a reference sensor can be used routinely to verify the performance of in situ sensors for some measurements, e.g., temperature or snow depth.
A portable instrument package that can be rotated among sensor sites can be useful in identifying problems. The portable package might run alongside installed sensors over a fixed period (daily or longer cycle) to inspect for drifting or failing sensors. This type of co-location might be done to audit sensor performance on an annual or periodic basis.
Record the date and time of known events that may impact measurements (see Management section). Ideally, these notes can be entered or captured for automated access. For example, sensors are known to demonstrate alternative behavior during site visits or maintenance activities, and light or trip sensors might be used in recording sensor access.
Routinely synchronize datalogger clocks with a public Network Time Protocol (NTP) server (http://www.ntp.org/).
Provide a reference time zone and avoid changing datalogger timestamps for daylight saving time. Many would argue the best practice is to output data in Coordinated Universal Time (UTC), which is particularly useful when data spans multiple time zones. However, most local users of the data prefer seeing output in local standard time because it corresponds to local ecological conditions, e.g., ocean tides or solar noon, and may ease troubleshooting or field-based checking. Another strategy is to provide the local offset from UTC within the data stream to allow simple conversion to UTC, or to allow users to query the data and choose the time zone in which they receive it. ISO 8601 (http://www.iso.org/iso/home/standards/iso8601.htm) is an international standard covering the exchange of date and time-related data and provides time zone support. For example, 2013-09-17T07:56:32-0500 carries the offset of a timestamp recorded in Eastern Standard Time (EST, five hours behind UTC). However, lack of support in many instruments and software packages is a drawback to its use. More recently, REST services have been constructed to return datetime values with an implicit time zone offset, enabling convenient sharing of data with timestamp flexibility.
Ensure that files stored on the logger are transmitted error-free to the data center for import (use error-corrected protocols like FTP, Ymodem and HTTP). Schedule manual file download and post-import checks if non-error-corrected protocols are used as an interim measure.
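
As a minimal sketch of the timestamp handling described above, the following Python parses an ISO 8601 timestamp that carries a UTC offset, converts it to UTC, and converts it back to a fixed local standard offset. It uses only the standard library; the function names and the choice of UTC-05:00 as the local standard offset are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for local standard time (assumed here: UTC-05:00, never
# shifted for daylight saving time).
LOCAL_STANDARD = timezone(timedelta(hours=-5))

def to_utc(stamp):
    """Parse an ISO 8601 timestamp with a UTC offset and convert it to UTC."""
    local = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
    return local.astimezone(timezone.utc)

def to_local_standard(moment):
    """Express a timezone-aware datetime in the fixed local standard offset."""
    return moment.astimezone(LOCAL_STANDARD)

utc_time = to_utc("2013-09-17T07:56:32-0500")
print(utc_time.isoformat())                     # 2013-09-17T12:56:32+00:00
print(to_local_standard(utc_time).isoformat())  # 2013-09-17T07:56:32-05:00
```

Storing the offset with every value lets either representation be derived from the other without consulting daylight saving rules.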

Quality Control (QC) on data streams

Quality Control of data streams involves automated or semi-automated processes whereby values and associated timestamps are cross-checked against predetermined standards and separate concurrently-collected data streams. QC takes place post-collection during the streaming process or after data is assimilated into a central database. Some processes can be performed in “near real-time”, or at the time the data streams are brought into the database, and data can be released as “provisional” after this initial inspection to satisfy immediate user needs. Other processes may require some delay such as trend analysis for sensor drift detection. Results of these tests are typically accounted for in a data qualifier flag for each value. Manual inspection and resolution of suspect or problem data is also a necessary step before data is released with “provisional” tags removed. Revised or corrected data versions can be published at a later date, and it is important to provide documentation on the types of quality checks conducted with each release of these data.

Three categories of automated or semi-automated QC processes can be described:

independent evaluation, whereby a single data point is checked against predetermined standards (such as range checks)
point-to-point evaluation, whereby a single data point is compared to other concurrently-observed data points (such as replicate sensors)
many-point, or trend analysis, where some timeframe of observations is examined statistically or against other data trends

The first two are essentially near real-time checks, whereas the third can involve timeframes several orders of magnitude longer than the measurement interval.

Near real-time processing involves automated checking of each data point and its associated date and time. Data qualifier codes, or data flags, will be assigned based on these checks. These automated checks and flag assignments are essential in processing the mass volumes of data streaming from sensor networks, but are not sufficient. Human inspection of data is critical and particularly might focus on data points that are flagged by an automated system. The following terminology corresponds with quality control tests listed in Campbell et al. 2013.

The most common and simplest checks to implement

Timestamp integrity checks – ensure that each date-time pair is sequential. With fixed interval data it is possible to cross-check the recorded timestamp against the expected timestamp.
Range checks - ensure that all values fall within established upper and lower bounds. Bounds can be established based on the specific sensor limitations, or can be based on historical seasonal or finer time-scale ranges determined for that location. Separate flags might be assigned to qualify impossible values (based on sensor characteristics) versus extreme values that are outside of the historic norms but within the sensor operating range.
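
The two checks above might be implemented along the following lines. This is a sketch under stated assumptions: the function names, the flag strings ("impossible", "extreme", "ok"), and the example bounds are hypothetical, and real bounds would come from sensor specifications and site history.

```python
from datetime import datetime, timedelta

def check_timestamps(stamps, interval):
    """Return indexes of records whose timestamp is not exactly one
    interval after the previous record (gaps, duplicates, reversals)."""
    return [i for i in range(1, len(stamps))
            if stamps[i] - stamps[i - 1] != interval]

def range_flag(value, sensor_min, sensor_max, hist_min, hist_max):
    """Classify one value against sensor limits and historical bounds."""
    if value < sensor_min or value > sensor_max:
        return "impossible"   # outside what the sensor can report
    if value < hist_min or value > hist_max:
        return "extreme"      # within sensor range, unusual for the site
    return "ok"

# Ten-minute data with one record missing between the third and fourth.
start = datetime(2013, 9, 17, 0, 0)
stamps = [start + timedelta(minutes=10 * k) for k in (0, 1, 2, 4, 5)]
print(check_timestamps(stamps, timedelta(minutes=10)))   # [3]

# Hypothetical air temperature bounds in degrees C.
print(range_flag(39.5, -40.0, 60.0, -25.0, 38.0))        # extreme
```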

Other checks can be employed for near real-time or in post-streaming QC

Persistence - checks for repeated, unchanging values in measures where constant change is expected.
Spike detection - checks for sharp increases or decreases from the expected value in a short time interval such as a spike or step function. These tests often employ statistical measures such as the standard deviation of the preceding values in detecting outliers or spikes that exceed 2-3 sigma (standard deviations) from what is expected. An alternative algorithm is to check to see that the median value of points t, t+1 and t-1 is not more than a fixed magnitude from point t.
Internal consistency – plausibility checks for consistency between related measurements such as that the maximum value is greater than the minimum value, or that snow depth is greater than its snow water equivalent. These checks may also examine values that are not possible under known conditions such as incoming solar radiation recorded during nighttime.
Spatial consistency – checks for sensor drift or failure based on intersite comparisons of nearby identical sensors. The integration of several data streams may be possible in post-processing and drifting may be detected based on known correlations or prior conditioning with redundant or nearby sensors.
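
The persistence check and the median-based spike check described above can be sketched as follows. The function names and the thresholds (`max_repeats`, `max_step`) are illustrative assumptions; suitable values depend on the measurement and its sampling interval.

```python
from statistics import median

def persistence_flags(values, max_repeats):
    """Flag each value that belongs to a run of more than `max_repeats`
    identical consecutive values."""
    flags = [False] * len(values)
    run_start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(values) or values[i] != values[run_start]:
            if i - run_start > max_repeats:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = i
    return flags

def spike_flags(values, max_step):
    """Flag point t when it lies more than `max_step` from the median
    of points t-1, t and t+1. The two end points are left unflagged."""
    flags = [False] * len(values)
    for t in range(1, len(values) - 1):
        local_median = median(values[t - 1:t + 2])
        flags[t] = abs(values[t] - local_median) > max_step
    return flags

series = [10.1, 10.2, 18.9, 10.3, 10.3, 10.3, 10.3, 10.4]
print(spike_flags(series, max_step=2.0))        # only index 2 is True
print(persistence_flags(series, max_repeats=3)) # indexes 3 to 6 are True
```

The median-of-three test catches isolated spikes but not step changes, since after a step the neighbors agree with the new level.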

Data qualifiers (data flags)

The QC system must be able to assign one or more codes to each data point based on the result of QC tests or other available information. Data flags may be assigned during the initial QC tests that are intended to guide local review in identifying erroneous or problematic data (e.g., invalid values out of range or below detection level), or might be flags that indicate site-specific events (e.g., low battery voltage, an icing or other event or site condition, or notification of a due date for sensor calibration). These internal flags may use a richer vocabulary of fine-grained flags than what is necessary to share publicly. Reviewing internal flags is necessary to resolve issues that may be evident in the data before these data are made available in final published versions. Some systems might employ a “rejected” flag as a means of preserving an original value but allow capability to withhold that value from public use.

External flags provided in published data will likely be a more general, simpler suite of flags better suited for public consumption. Multiple internal flags would be mapped into this more general flag set. While many vocabularies are in use, an example suite of external flags follows:

A: Accepted
E: Estimated
M: Missing
Q: Questionable
Specification of uncertainty

The “Accepted” flag should be assigned to values where no apparent problems are discovered, but the QC tests that were applied should be described. The “Accepted” flag is likely less commonly used than simply leaving the flag blank. If the blank flag is used it should be included in the list of flags and defined, e.g., “no QC tests were applied” or “no recognizable problems” or “provisional data”. A blank flag can be included in an enumerated listing of valid flags but may not be the best practice within some metadata standards. A “Provisional” flag is not listed here but may be appropriate. Alternatively, “provisional” data might be indicated within a “quality level” attribute on the record level or file level rather than associated with an individual measurement (See Data Quality Level section below).
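
The mapping of a fine-grained internal flag vocabulary onto the simpler external set can be sketched as a lookup table. The internal flag names below and the severity ordering used to pick one external flag are hypothetical; each site would define its own vocabulary and precedence rules.

```python
# Hypothetical internal flags mapped to the example external flag set.
INTERNAL_TO_EXTERNAL = {
    "range_impossible": "M",   # value withheld and reported as missing
    "range_extreme": "Q",
    "spike": "Q",
    "persistence": "Q",
    "low_battery": "Q",
    "gap_filled": "E",
}

# Assumed precedence: when several internal flags apply, the most
# severe external flag wins.
SEVERITY = {"A": 0, "E": 1, "Q": 2, "M": 3}

def external_flag(internal_flags):
    """Collapse a collection of internal flags into one external flag."""
    mapped = [INTERNAL_TO_EXTERNAL[f] for f in internal_flags]
    if not mapped:
        return "A"  # no problem raised by the tests that were applied
    return max(mapped, key=SEVERITY.get)

print(external_flag([]))                              # A
print(external_flag(["spike", "low_battery"]))        # Q
print(external_flag(["spike", "range_impossible"]))   # M
```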

Examples of Quality Flag Sets (listed codes may only represent a subset of each flag set)
Andrews LTER	WISKI (Univ. of Saskatchewan)	HFR LTER	VCR LTER	SeaDataNet
A - Accepted	10 - Rejected	M - Missing	Blank - OK	0 - no QC
E - Estimated	15 - Disregard	E - Estimated	Q - Questionable	1 - Good value
M - Missing	20 - Manually edited	Q - Questionable	M - Missing	2 - Probably good
Q - Questionable	25 - Simulated		R - Range Error	4 - Bad value
Measurement specific, e.g., B - Below detection	30 - Filled		S - Data Spike	6 - Below Detection

The evaluation of extreme values may benefit from “expert inspection” that can be built into the QC system. Historical ranges can be developed for sites with long-term sensor measurements at annual, seasonal or finer time scales. For remote sites that are data sparse these ranges may be a primary tool for ascertaining data quality, and, for example, a QC system may flag values that fall outside of two standard deviations of long-term means. Where other nearby in situ measurements are available or where national surface station networks are available, quality checks may be improved through comparison of values. Access to multiple climate elements may provide the ability to create relationships among stations and allow specification of uncertainty for all values. Evaluation of a QC system’s performance in determining uncertainty or in estimating values will be important in making system improvements and potentially allowing a retrospective re-application of quality control (Daly et al. 2005).
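
A check against historical ranges, such as the two-standard-deviation rule mentioned above, might look like the following sketch. The function names and the sample values are assumptions for illustration; in practice the history would be drawn from the same season or a finer time window at the site.

```python
from statistics import mean, stdev

def historical_bounds(history, n_sigma=2.0):
    """Return (low, high) bounds as the mean plus or minus `n_sigma`
    sample standard deviations of the historical values."""
    center = mean(history)
    spread = stdev(history)
    return center - n_sigma * spread, center + n_sigma * spread

def outside_history(value, history, n_sigma=2.0):
    """True when `value` falls outside the historical bounds."""
    low, high = historical_bounds(history, n_sigma)
    return value < low or value > high

# Hypothetical long-term September daily mean temperatures (degrees C).
september = [14.0, 15.0, 16.0, 15.0, 14.0, 16.0, 15.0, 15.0]
print(outside_history(15.8, september))   # False
print(outside_history(19.0, september))   # True
```

A value flagged this way is unusual rather than necessarily wrong, so the result is better treated as a prompt for review than as grounds for rejection.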

Where specifications of uncertainty cannot be determined, values may be deemed “Questionable” by an automated system. Ultimately, manual evaluation may be required to decide whether a data point is released as “Accepted”, removed from the data stream and listed as “Missing”, or left flagged as “Questionable”. As Daly et al. 2005 point out, “in the end, the fundamental dilemma with nearly all quality control is a tension between the relative merits and costs of accidentally rejecting good data, or accidentally accepting bad data, and a tradeoff is usually involved”.

Where data are missing, an option might be to fill gaps with “Estimated” data. From Campbell et al. 2013, “filling these gaps may enhance the data’s fitness for use but can possibly lead to misinterpretation or inappropriate use, and can be a complex endeavor. The decision about whether to fill gaps and the selection of the method with which to do so are subjective and depend on factors such as the length of the gap, the level of confidence in the estimated value, and how the data are being used”.

Data quality level

The level of QC testing applied to a set of data should be well-described and transparent to the data user. Publishing of data is independent of data quality, and users need to be able to quickly identify its quality level, for example, to discern whether the data is unchecked, raw data vs. thoroughly inspected and reviewed. Groups such as NEON and CUAHSI have assigned a quality level to data products including original raw data, initially inspected and flagged raw data, published raw data, and estimated, gap-filled or other synthetic products involving model-based or scientific interpretation (See references in data_quality_level.pdf). While these groups do not necessarily agree on the actual level assignment, there are some general concepts of quality level that can be agreed upon and are represented here:

Level 0 (raw) ‐ Unfiltered, raw data, with no QC tests applied and no data qualifiers (flags) applied - Typically, these are original data streams that are not published but that should be preserved. Data quality flags are not assigned. Conversion of raw measurement values to more meaningful units may be acceptable, e.g., thermocouple table conversions of millivolts to degrees C.

Level 1 (provisional) ‐ Provisional data released in near real-time with initial QC testing applied - Preliminary QC tests or data calibration are applied, potentially in near real-time through automated scripts. Data qualifiers are assigned and may be for internal use intended to guide further review of the data (See Data qualifiers subsection). All data qualifiers should be well-defined. Range and date-time checking are commonly applied to this provisional level. The QC tests applied should be well-described.

Level 1 (published) - Published data with a delayed release after automated and manual review - QC testing is complete and suspect data has been inspected and flagged appropriately. Each value is assigned a data qualifier and the set of flags may be a more simple set devised for public use of the data. Impossible or missing values would be assigned an appropriate missing value code and a data flag of “Missing”. Data would no longer be considered provisional and would be unlikely to change.

Level 2 (gap-filled) - Gap-filled or estimated data involving interpretation - This is quality enhanced data where careful attention has been applied to estimate or fill gaps in data or to otherwise build derived data to accommodate data user needs, for example estimate gaps in a sensor stream using a nearby sensor. As gap-filling typically involves interpretation and may employ multiple models or algorithms, other versions of level 2 data may be used in practice. Methods employed in gap-filling or deriving data should be well-described.
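
One simple way to estimate gaps from a nearby sensor is a straight-line fit between the two sensors, sketched below. This is only one of many possible gap-filling methods and is not presented as the approach used by any particular site. The function name is hypothetical, and the sketch assumes at least two timestamps where both sensors reported, with differing neighbor values.

```python
def fill_from_neighbor(target, neighbor):
    """Fill gaps (None) in `target` from `neighbor` using a least-squares
    line fitted on timestamps where both sensors reported.

    Returns (values, flags). Filled values are flagged "E"; gaps that
    cannot be filled stay None with flag "M".
    """
    pairs = [(x, y) for x, y in zip(neighbor, target)
             if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    values, flags = [], []
    for x, y in zip(neighbor, target):
        if y is not None:
            values.append(y)
            flags.append("A")
        elif x is not None:
            values.append(intercept + slope * x)
            flags.append("E")
        else:
            values.append(None)
            flags.append("M")
    return values, flags

target = [1.0, 3.0, None, 7.0, None]
neighbor = [0.0, 1.0, 2.0, 3.0, None]
values, flags = fill_from_neighbor(target, neighbor)
print(flags)   # ['A', 'A', 'E', 'A', 'M']
```

Whatever method is used, the fitted relationship and the period it was fitted on belong in the documentation for the level 2 product.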

Aggregating data from one time-step to another, e.g., creating daily summary data from 10 minute data, would not necessarily alter the quality level when no interpretation is involved, i.e., when only simple means, maximums, and minimums are determined. That is, mean daily temperature determined from level 1 (published) data would still retain a quality level 1. However, interpretation may be involved when determining an appropriate qualifier flag for the daily mean. For example, if some of the 10 minute observations are missing, at what point does the daily mean also become missing (e.g., more than 20% are missing) or become questionable (e.g., more than 5% are missing)? This type of processing may yield daily mean values that are best described as Level 2 as interpretation is involved.
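
The flagging decision for an aggregated value can be sketched as below, using the example thresholds from the paragraph above (more than 20% missing gives a missing daily mean, more than 5% gives a questionable one). The function name and the use of None for a missing observation are assumptions, and the input is assumed to hold one entry per expected observation.

```python
def daily_mean_with_flag(values, missing_limit=0.20, questionable_limit=0.05):
    """Aggregate one day of fixed-interval observations.

    `values` holds one entry per expected observation, with None for
    a missing observation. Returns (mean or None, flag).
    """
    present = [v for v in values if v is not None]
    missing_fraction = 1 - len(present) / len(values)
    if missing_fraction > missing_limit:
        return None, "M"
    daily_mean = sum(present) / len(present)
    if missing_fraction > questionable_limit:
        return daily_mean, "Q"
    return daily_mean, "A"

# 144 ten-minute observations make up one day.
complete = [10.0] * 144
some_gaps = [10.0] * 130 + [None] * 14      # about 9.7% missing
many_gaps = [10.0] * 100 + [None] * 44      # about 30.6% missing
print(daily_mean_with_flag(complete))    # (10.0, 'A')
print(daily_mean_with_flag(some_gaps))   # (10.0, 'Q')
print(daily_mean_with_flag(many_gaps))   # (None, 'M')
```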

Data collection interval

Data loggers offer the capability to easily output mean data values at multiple time steps, e.g., 10 minutes, hourly, daily. Saving values at multiple time steps may present an extra complication in the QC process as separate tables are usually stored for each timestep. When a single sensor measurement is reported at separate time steps, conflicting QC results may occur if both streams are QC’d independently. One strategy to simplify this problem is to output most or all data in the shortest common timestep and use post-processing to statistically aggregate the data at longer time steps. For example, a system might QC and output the 10 minute data and then aggregate hourly and daily values from this finer resolution 10 minute data stream. Dataloggers might typically calculate and output daily (24-hour) data streams, but accurate QC may be impossible as the exact values used in this aggregation are unknown, and the aggregation may represent only a subset of values, e.g., if there was a power discontinuity to the logger. However, there may be cases where the output of daily values by the logger is important. For example, an instantaneous maximum or minimum value based on a single logger sample would not be captured through this aggregation, and a daily minimum or maximum based on a 10 minute or hourly mean output may differ significantly from the instantaneous value.

Data Management

Timing of QC system processes

Automated QC system procedures provide the most timely and efficient processing of streaming data. The use of system procedures provides consistent assignment of data flags and removes much of the subjectivity inherent in manual assignment. Ideally, the QC system will be employed every time data is acquired, e.g., every 10 minutes, and secondarily operate on hourly or daily time periods. More comprehensive visual or programmatic checks or the assignment of uncertainty using nearby or other related sites might occur at a later time. The frequency and timing of manual or visual review processes will depend on the data flow at the site, software stack, and data processing capabilities. The necessary timeframe for data delivery of provisional versus fully processed data should be considered.

Documentation of the QC processes

The documentation of QC processes should identify the near real-time streaming QC methods including assumptions and thresholds, and additional algorithms or visual methods applied. If no QC is applied that should be made apparent. Descriptions of data processing and QC workflows are also useful in describing data provenance and all workflow versions should be retained (See example workflow). Data measurement attributes and qualifier flags should be defined.

The application of the QC tests employed or any algorithms applied to aggregate, estimate or gap-fill data should be described for all data levels, and data levels can potentially be defined in conjunction with a data release policy. Ideally, data at each level should be locally archived. Level 0 raw data should be retained locally in its original, unmanipulated state. Level 1 (published) or level 2 data may be the best candidates for more formal archiving. Data sets should be transparently tagged with a data quality level as data are released.

Sensor data documentation

Develop and use a common vocabulary and syntax for sensor measurement attribute names and file naming conventions. Research organizations with multiple sensor sites measuring common sets of parameters can greatly improve efficiency and more easily employ automated methods when a common vocabulary is employed. These naming conventions should be planned from the outset into datalogger programs and other software employed within the data flow.

Data qualifier flags provide documentation for each measured value and should be placed alongside the value as data files are produced for archival storage. An additional attribute or method code may also be added to note shifts in method or instrumentation or other key changes in collection procedures. Inclusion of a method code directly within the data file places key documentation close to the data value and is more visible to the data user. In long-term data streams where the quality level may change over time, e.g., periods of time where gap-filling is employed, a data quality attribute might be used to assign data quality at the record or measurement level.
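
A file layout that keeps the flag, method code, and quality level next to each value might be produced as in this sketch. The column names, the method code "AIR001", and the sample records are hypothetical; they illustrate the idea of a consistent attribute naming convention rather than any established vocabulary.

```python
import csv
import io

# Each measured attribute is followed by its flag and method code columns.
FIELDS = ["timestamp", "airtemp_c", "airtemp_c_flag",
          "airtemp_c_method", "quality_level"]

def write_records(records, handle):
    """Write records so each value sits beside its flag and method code."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)

records = [
    {"timestamp": "2013-09-17T07:50:00-0500", "airtemp_c": 12.1,
     "airtemp_c_flag": "A", "airtemp_c_method": "AIR001", "quality_level": 1},
    {"timestamp": "2013-09-17T08:00:00-0500", "airtemp_c": 12.4,
     "airtemp_c_flag": "E", "airtemp_c_method": "AIR001", "quality_level": 2},
]
buffer = io.StringIO()
write_records(records, buffer)
print(buffer.getvalue())
```

Deriving the flag and method column names from the attribute name (here by suffix) makes it straightforward to generate and parse such files automatically across sites.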

Best Practices

Reorganized from: Campbell et al. 2013.

Sensor Quality Assurance (QA)

Maintain an appropriate level of human inspection
Replicate sensors, n=3 is optimal
Schedule maintenance and repairs to minimize data loss
Have ready access to replacement parts
Record the date, time, and timezone of known events that may impact measurements
Implement an automated alert system to warn about potential sensor network issues

Quality Control (QC) on data streams

Ensure that data are collected sequentially
Perform range checks on numerical data
Perform domain checks on categorical data
Perform slope and persistence checks on continuous data
Compare data with data from related sensors
Use flags to convey information about the data
Estimate uncertainty in the value, if feasible
Correct data or fill gaps if it is prudent

Data management

Automate QA/QC procedures
Retain the original unmanipulated data
Indicate data quality level with each release of the data
Provide complete metadata
Document all QA/QC procedures that were applied and indicate data quality level
Document all data processing (e.g., correction for sensor drift)
Retain all versions of workflows and metadata (data provenance).

Case Studies

We are looking for case studies that will describe some complete QC systems, QC processing and general setup (e.g., number and type of sensors, dataloggers, telemetry, etc.)
Examples using GCE Toolbox, Vista Data Vision, R, etc. would be useful
General workflow example from Nevada Research Data Center

References

Campbell, JL, Rustad, LE, Porter, JH, Taylor, JR, Dereszynski, EW, Shanley, JB, Gries, C, Henshaw, DL, Martin, ME, Sheldon, WM, Boose, ER. 2013. Quantity is nothing without quality: Automated QA/QC for streaming sensor networks. BioScience. 63(7): 574-585. http://www.treesearch.fs.fed.us/pubs/43678

Taylor, JR, Loescher, HL. 2013. Automated quality control methods for sensor data: a novel observatory approach. Biogeosciences. 10: 4957-4971. http://www.biogeosciences.net/10/4957/2013/bg-10-4957-2013.pdf

Daly, C, Redmond, K, Gibson, W, Doggett, M, Smith, J, Taylor, G, Pasteris, P, Johnson, G. 2005. 15th AMS Conf. on Applied Climatology, American Meteorological Soc. Savannah, GA, June 20-23, 2005. http://ams.confex.com/ams/pdfpapers/94199.pdf

Resources

QC Resources

Campbell et al. 2013 Bioscience http://www.treesearch.fs.fed.us/pubs/43678
Taylor and Loescher 2013. Biogeosciences http://www.biogeosciences.net/10/4957/2013/bg-10-4957-2013.pdf
Daly et al. 2005. 15th AMS Conf. on Applied Climatology, Amer. Meteorological Soc. http://ams.confex.com/ams/pdfpapers/94199.pdf
CUAHSI http://wdc.cuahsi.org/wdc/Docs/ODM1.1DesignSpecifications.pdf
NOAA Satellite and Information Service (National Climate Data Center) http://www.ncdc.noaa.gov/oa/climate/ghcn-daily/
Carbon dioxide information analysis center http://cdiac.ornl.gov/epubs/ndp/ushcn/daily_doc.html
SeaDataNet http://www.seadatanet.org/Standards-Software/Data-Quality-Control
Data Quality Assessment: Statistical Methods for Practitioners http://www.epa.gov/quality/qs-docs/g9s-final.pdf

Flag set examples

NOAA National Climatic Data Center http://www.ncdc.noaa.gov/oa/hofn/coop/coop-flags.html
Campbell et al. 2013 Bioscience (See p. 580) http://www.treesearch.fs.fed.us/pubs/43678

Data quality level

NEON http://www.neoninc.org/documents/513
CUAHSI http://his.cuahsi.org/documents/ODM1.1DesignSpecifications.pdf, pp. 19-20, 57-58
Ameriflux http://public.ornl.gov/ameriflux/available.shtml
Earth Science Reference Handbook http://eospso.gsfc.nasa.gov/sites/default/files/publications/2006ReferenceHandbook.pdf (p.31)
ILRS Data products: (CODMAC - Committee on Data Management, Archiving and Computing) http://ilrs.gsfc.nasa.gov/about/reports/9809_attach7b.html