EnviroSensing Monthly telecons

From Earth Science Information Partners (ESIP)
Revision as of 12:21, May 2, 2016 by Alison Adams (talk | contribs)

Telecons are on the first Monday of every month at 2:30pm ET. Click on 'join'

Schedule

  • December 23, 2014, 4:00 PM EST. No telecon - Happy Holidays
  • January 27, 2015, 4:00 PM EST. Rick Susfalk, Division of Hydrologic Sciences, Desert Research Institute

Notes and recordings from past telecons

To watch the recordings make sure you allow popups for https://esipfed.webex.com

05/02/2016

Attendees:
Don Henshaw (Forest Service in Oregon)
Alison Adams (EnviroSensing Student Fellow)
Annie Burgess (ESIP)
Lindsay Barbieri (Student Fellow)
Carlos Rueda (software engineer at Monterey, part of XDOMES team)
Corinna Gries (LTER)
Felimon Gayanilo (Texas A&M, XDOMES)
Janet Fredericks (WHOI, XDOMES)
Wade Sheldon (LTER)
Mark Bushnell (oceanographer, XDOMES)
Jane Wyngaard (JPL)

Notes:

Mark Bushnell – Quality Assurance and Quality Control of Real-Time Ocean Data (QARTOD):

QARTOD manuals: focus on real time, usually coastal
Includes quality control tests and quality assurance of sensors (in an appendix)
Discussion of operational vs. scientific quality control and different needs/contexts for each
Board meets quarterly to review progress and identify next variables (if you have ideas for variables, let Mark know!)
Each manual takes 6-8 months and each one is a living document that is updated
26 core variables
next up: phytoplankton species!
Discussed an example test from the waves manual
Five “states” for data qc flags (pass, not evaluated, suspect or of high interest, fail, missing data)
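
The five flag states can be sketched as a small enumeration. The numeric codes below follow the convention used in the QARTOD manuals but were not recorded in these notes, so treat them as an assumption. The `gross_range_test` function and its limits are hypothetical and are not the example test from the waves manual.

```python
from enum import IntEnum
from typing import Optional, Tuple


class QcFlag(IntEnum):
    """Five QC flag states from the notes; the numeric codes are assumed."""
    PASS = 1
    NOT_EVALUATED = 2
    SUSPECT = 3        # suspect or of high interest
    FAIL = 4
    MISSING = 9


def gross_range_test(value: Optional[float],
                     fail_span: Tuple[float, float],
                     suspect_span: Tuple[float, float]) -> QcFlag:
    """Hypothetical range test that flags a single observation."""
    if value is None:
        return QcFlag.MISSING
    if not (fail_span[0] <= value <= fail_span[1]):
        return QcFlag.FAIL
    if not (suspect_span[0] <= value <= suspect_span[1]):
        return QcFlag.SUSPECT
    return QcFlag.PASS


# Illustrative wave heights in metres; the limits are made up.
heights = [1.2, 7.5, None, 25.0]
flags = [gross_range_test(h, (0.0, 20.0), (0.0, 6.0)) for h in heights]
```

Keeping "suspect" separate from "fail" lets unusual but possibly real values (such as extreme events) stay visible instead of being discarded.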

Questions for Mark:

How to handle flags that represent a mix of semantic notions? Hard for data consumers to understand (what’s the actual problem?)
What about showing (or not showing) data that doesn’t meet a certain standard?
If you’re looking for extreme events, for example, you might want to see all the data…
Helpful to have the option to see all the data (if you’re an operator, say, you might look at failed tests for an instrument)
Best to let the data user select what level of quality they’re interested in
For EnviroSensing, is there a place to save/share code for QC tests?
Not at the moment
Would be good to start tracking/storing code for tests that people do somewhere and have a DOI that describes the processing
look here --> https://github.com/ioos/qartod and/or here --> https://github.com/asascience-open/QARTOD for what exists now
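
One way to let the data user select a quality level, as discussed above, is to store every observation with its flag and filter only at read time. This is a minimal sketch, assuming integer flag codes 1 (pass), 3 (suspect), and 4 (fail); the `select` helper and the sample records are hypothetical and are not taken from either repository.

```python
# Each record keeps its value and QC flag; nothing is discarded at ingest.
records = [
    {"time": "2016-05-02T12:00", "value": 1.1, "flag": 1},
    {"time": "2016-05-02T12:30", "value": 6.8, "flag": 3},
    {"time": "2016-05-02T13:00", "value": 42.0, "flag": 4},
]


def select(records, accepted_flags):
    """Return only the records whose flag the user chose to accept."""
    return [r for r in records if r["flag"] in accepted_flags]


strict = select(records, {1})            # only data that passed
extremes = select(records, {1, 3})       # also suspect / high-interest data
operator = select(records, {1, 3, 4})    # everything, including failed tests
```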

04/04/2016

Attendees (small group):
Don Henshaw
Mark
Corinna
Scotty Strachan
Janet Fredericks

Minutes:

Best practices work:

Post a citation suggestion on the introduction page of best practices? Maybe also a snapshot PDF that folks could download with a citation?
People DO go to this page, says Scotty
Mountain Research Institute (MRI) is doing some work trying to write best practices, would be good to try to get them to use ours rather than create something entirely new…

Future of EnviroSensing cluster:

Just need to have one vision at a time--for now we can stay focused on the XDOMES work
Scotty wants to continue to stay involved, promote cluster and its work a bit more to folks after finishing up his PhD (soon!); would be interested in taking lead in cluster after that, too

Future meetings:

Mark will present material on real-time quality control that was planned for this month on the next call instead, due to low attendance on this call
Would be great to reiterate that we didn’t start as ONLY LTER--the best practices doc had input from other folks too--and that that isn’t where we have to stay, either--we can continue to incorporate other things/groups
Janet will talk about summer meeting workshop plans on June call: registering vocabulary
will be a good workshop for beginners/people who need to be introduced to the concept
Summer meeting (July): have a 1.5-hour session for EnviroSensing--should email asking for folks to present

03/07/2016

Attendees:
Alison Adams
Don Henshaw
Janet Fredericks
John Porter
Felimon Gayanilo
Krzysztof Janowicz
Carlos Rueda
Peter Murdoch
Vasu Kilaru
Scotty Strachan
Corinna Gries
Ethan McMahon

Minutes:

1. Future meeting ideas (Don)

Revisit best practices--have folks present the chapters they did and rekindle interest; Scotty said he’d be willing to do this for his chapter
Email Alison at alison.adams@uvm.edu if you have ideas for future telecons!

2. Summer meeting -- session/workshop ideas? (proposals due at beginning of April)

XDOMES workshop: connected to EnviroSensing cluster?
sensor provenance EnviroSensing breakout session?
Janet to lead more hands-on workshop on Semantic Web, etc.?
Let Alison, Don, or Janet know if you have additional ideas

3. Update on work plan draft (Don & Janet)

Conversation with Erin last week
If you might like to be the next leader, let Don know--thinking of having an “on deck” leader position
Interest in AGU Geospace blog? Lots on data use, etc. This could be a place to put out info about our cluster

4. Rotating chair position for Products & Services (Alison)

Right now, Products & Services does three things: (1) FUNding Friday, (2) the P&S Testbed, and (3) tech evaluation process with NASA AIST to evaluate AIST-funded projects. Also working on an evaluation process for the projects funded through the testbed, and would hopefully provide this to the Earth science community at large eventually.
P&S wants to have a rotating co-chair position; would last for three months and would be a rep from a different committee. The first co-chair (starting in April) wouldn’t be involved in proposal evaluation/selection, but would be involved in ideas for student/PI matchmaking and mentoring for FUNding Friday and the evaluation and testbed activities. It would be a great opportunity to learn more about P&S and have them learn more about us. It wouldn’t prevent you from submitting a proposal to the testbed.
If you’re interested, email Soren at sorenscott@gmail.com; you can sign up for the rotating co-chair position here

5. XDOMES team on Semantic Web (Krzysztof Janowicz)

Full recording with slides available in recorded meeting
Semantic technologies can improve semantic interoperability
More intelligent metadata, more efficient discovery, use, reproducibility of data; reduce misinterpretation and meet data requirements of journals, etc.
Large community in science, gov’t, industry; many open source and commercial tools and infrastructures; there are existing ontologies and huge amounts of interlinked data
Move the logic OUT of software and INTO the data
Semantic technologies support horizontal as well as vertical workflows
Semantic interoperability can only be measured after the fact because meaning is an emergent property of interaction; come up with technologies that prevent data that shouldn’t be combined from being combined
Discussion:
How much of this is already in use or is this just setting up the platform at this point?
With XDOMES we are at an early stage, but in terms of the infrastructure, a lot of it is already productively used

02/02/2016

Attendees:
Bar (Lindsay Barbieri) - Data Analytics Student Fellow
Corinna Gries (North Temperate Lakes LTER site)
Janet Fredericks (Woods Hole, managing coastal observatory and leading XDOMES project)
Vasu Kilaru (EPA Office of Research and Development)
Josh Cole (UMBC Data Manager -- interested in hearing about XDOMES)
Scotty Strachan (Geography Dept at University of Nevada, Reno, current PhD student)
Pete Murdoch (Science Advisor for NE Region of USGS, also working with DOI)
Ethan McMahon (EPA Office of Environmental Information)
Renee Brown (Univ of New Mexico, Sevilleta LTER and Field Station)
Jane Wyngaard (Post-doc at JPL)
Don Henshaw (Forest Service PNW Research Station)

Notes drafted by Alison Adams (EnviroSensing Student Fellow) according to meeting recording

On the agenda:
Janet on XDOMES
What should EnviroSensing cluster do?


XDOMES (Cross-Domain Observational Metadata Environmental Sensing Network) NSF project funded as part of EarthCube

  • cross-domain observational metadata
  • focusing on: sensor metadata creation, data quality assessment, sensor interoperability, automated metadata generation, content management
  • 4 types of work (funded for two years)
    • software development to create SensorML generators and registries for semantic technologies and SensorML documents
    • Trying to engage people like this community and sensor manufacturers to create a network of people interested in promoting adoption of SensorML technologies
    • Usability assessment
    • Integration of semantic interoperability
  • this project is about capturing metadata at the time of CREATION of the data
  • GOAL: put out metadata automatically in interoperable ways (community-adopted, standards-based framework)
  • use of registered terms to enable interoperability
  • can also help manage operations by providing standardized information about how sensors were configured, etc.
  • want to develop community within ESIP who will follow and promote this approach
    • vet, look at usability assessment, what do you want us to do and is it useful to you?

Discussion

  • Vasu says this is related to things EPA has been working on--excited to continue the conversation
  • Jane asked about communication with sensor-creators so this can be implemented in new sensors; Janet said that isn’t part of the project at this point
  • Potential application for folks with Alliance for Coastal Technologies--they work on sensor validation and I think it’s a slightly different take, but this is definitely something they could benefit from (like if we start describing our sensors in a way that the information can be harvested)
  • If people are interested you can see what Janet is putting in the file and think about whether this would be helpful or useful or whether she’s missing something; would you be willing to test this in your own environment?
  • Delivering data is kind of beyond the scope of the XDOMES project at this point
  • Vision: info is out on the web, and you have an app to go and harvest the information that you need
  • Do we want to work this pilot project into public presence in this cluster?

Future WebEx ideas

  • Could do a WebEx on how WaterML, SensorML, etc. fit together
  • Janet is approaching Krzysztof Janowicz to discuss sensors and semantics with us on the March call

06/22/2015

Today we had Dave Johnson from LI-COR Environmental speak with us about eddy covariance. We had a large group of attendees.

Dave shared with us links to the LI-COR software [1] and a book [2]

  • The problem they are solving is that the data coming in with eddy covariance is high resolution and large, more than 300 million records per year.
  • They were able to put much of the processing and QC logic into the SMART system, which is on the instruments in the field.
  • An eddy covariance system can "see" about 100 x its height above the canopy.
  • We were interested in the possibilities for real-time data qc, and also how the information can be transferred between the instruments in the field (i.e. cell modem, line, etc.)
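
For a rough sense of scale, the record count above is consistent with a 10 Hz sampling rate, and the footprint rule of thumb is easy to evaluate. Both the 10 Hz rate and the 3 m height above the canopy are assumptions for illustration, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60            # 31,536,000 seconds

sample_rate_hz = 10                              # assumed sampling rate
records_per_year = sample_rate_hz * SECONDS_PER_YEAR

height_above_canopy_m = 3.0                      # assumed measurement height
footprint_m = 100 * height_above_canopy_m        # "about 100 x its height"

print(f"{records_per_year:,} records per year")
print(f"approximate footprint: {footprint_m:.0f} m")
```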

02/24/2015

Today we had William Jeffries from Heat Seek NYC talk to us about his platform for civilian monitoring of home temperatures in apartments using low-cost sensors and wireless mesh networks taking hourly readings.

Attending the call were Fox and Don from Andrews, Josh at UM, Jason in Fairbanks, Becky from Onset, Ryan from UT and Scotty from UNR as well as some listen-in callers.

  • Heat Seek NYC has nodes containing up to 1000 low-power, low-cost thermometers, which are mapped to custom printed circuit boards. The software stack is Ruby on Rails with a PostgreSQL backend.
  • HSNYC focuses on sensor networks' ability to affect and effect policy, and so far has seen that the government views sensor networks as a cool solution to a lot of difficult regulatory challenges. They provide reliable information that officials and citizens can use, with printable, court-friendly PDFs.
  • Funding is through Kickstarter, and the project was started through the NYC BigApps campaign, a giant civic hacking competition.
  • They have been incorporated as a nonprofit in NY since 2014.
  • The first sensors were off-the-shelf Twine temperature sensors, which did not work well (stale data).

  • Now they make their own temperature sensors, using a push system instead of a pull system (if there's a problem with the system, you just don't get a pull out of it).
  • We asked about an LCD display; the reason for not implementing one is that many tenants aren't as interested in the data as in the response from policy makers.
  • Some people will abuse the system, so they use tamper-evident tape and photos of the installation to protect it; landlords have financial incentives, and there can be an intentional lack of repairs.
  • Local caching of sensor data occurs: sensors cache at the hub, and a relay server sends the data back.
  • There are several levels of caching, as well as flash memory on the XP radios.
  • A key question was whether there is long-term storage that acts as a local cache. We don't have a solution either.
  • We had interest in a company called H2O Degree.
  • For QC, they compare indoor and outdoor temperatures, and compare the sensors to one another and to store-bought sensors before and after installation.
  • Frequency for radios/caching is hourly: the radios wake up to send point data.
  • We are all interested in a low- or no-power source/transmitter, maybe based on Raspberry Pi.
  • This needs a high peak power envelope: transmission has to deal with nodes being fairly far apart, so the implementation needs to come online, transmit a strong signal for a short time, and go back to sleep.

Note on Webex recording: I am going to check into the recording- my system claims to have been recording but it is nowhere to be found (Fox)

01/27/2015

Rick Susfalk from the Desert Research Institute presented about Acuity Data Portal. Notes from meeting taken by Fox Peterson, please edit as you see fit.

Eight people attended.

Acuity Portal System

  • Started in 2006
  • originally VDV data solution
  • improvements to web-interface; sits on top of VDV as Acuity server
  • Acuity is a continuous monitoring of key client-driven data
  • it includes sensors and data logging deployment and maintenance, telemetry, data storage and analysis, automated alerting, web portal for data access
  • individualized web presence tailored to client needs
  • not a single tool, but instead integrates commercial, open source, and proprietary hardware and tools
  • customizable project specific descriptions
  • common tools used to provide rapid, cost-effective deployment of individualized portals
  • physical infrastructure is shared amongst smaller clients for cost-saving or it can be segregated for larger clients.
  • access is controlled down to the variable level-- "we can define who gets to see what"-- for example, the public cannot see some features
  • one view could be "pre-defined graphs" without logging in, but if you want to download the data you must log in at your permissions level

SECURITY


ANALYSIS

  • customized thresholding and data-freshness
  • trending alerts, for example, know if battery will go bad
  • stochastic and numerical modeling
  • scoring incoming data for QA/QC processing
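
Thresholding and data-freshness checks like those listed above can be sketched in a few lines. This is an illustration only, not Acuity's implementation; the `check_reading` function, the battery-voltage limits, and the one-hour freshness window are all hypothetical.

```python
from datetime import datetime, timedelta


def check_reading(value, timestamp, now, low, high, max_age):
    """Return a list of alert names for one reading (empty if all is well)."""
    alerts = []
    if now - timestamp > max_age:
        alerts.append("stale")          # data-freshness check
    if value < low or value > high:
        alerts.append("out_of_range")   # threshold check
    return alerts


now = datetime(2015, 1, 27, 16, 0)
one_hour = timedelta(hours=1)

# Battery voltage readings with made-up limits of 11.0-15.0 V.
fresh_ok = check_reading(12.4, now - timedelta(minutes=10), now, 11.0, 15.0, one_hour)
stale_low = check_reading(10.2, now - timedelta(hours=3), now, 11.0, 15.0, one_hour)
```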

QA/QC tools

  • web-based GUI
  • users and managers can create, edit, and modify alerts online
  • groups can be created so that you can schedule management and alerts
  • also offers localized redundant alerting
  • two-way communication with the Campbell data loggers (CR1000)
  • more about getting the data to the data managers for more in-depth QA/QC than about providing that part of the tooling

Data graphing features

  • Pre-determined graphs for basic users
  • Data selector for more advanced users
  • "we don't know what the users want to see so we give them the tools to do it" (good idea!)
  • anything you can change in Excel you can change in their graphs on the website -

Links

Metadata

  • relates your parameters to the network and what other sensors are doing
  • current system is getting more flexible
  • metadata is still largely user responsibility

Flight plan - safety tool

  • field personnel are data
  • users put in the travel time for safety
  • buddy system, alerted right before you return, then calls boss etc. Many levels hierarchy

Demos

  • Portals that are monitoring things
  • ability for data refreshment
  • colors for indication, ex., data would not be gray if there was lots of new data
  • users can change the settings on the data logger
  • scrolling, scaling, plotting, etc. via interaction with the user
  • can save your own graphs

graphs and alerts

  • many parameters
  • you can save!
  • email, sms, phone
  • default settings for users
  • lots of personnel management tools in this in general
  • cross-station "truly alarm or not" if station1 has a value but station2 has a different one, don't alarm sorts of rules
  • lists/user groups appear to be very important with this tool
  • sensor and triggers: customize one or more parameters that you are bringing into your database
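
The cross-station rule mentioned above can be read as "alarm only when the stations agree". A minimal sketch under that reading; the `should_alarm` function and the stage threshold are hypothetical and do not describe Acuity's actual rule syntax.

```python
def should_alarm(station_values, threshold):
    """Alarm only if every station reports a value above the threshold."""
    return all(v > threshold for v in station_values)


# Made-up stream stage readings in metres from two neighbouring stations.
both_high = should_alarm([2.7, 2.9], threshold=2.5)      # stations agree
one_disagrees = should_alarm([2.7, 1.1], threshold=2.5)  # station2 differs
```

Requiring agreement reduces false alarms from a single faulty sensor, at the cost of missing events that only one station can see.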

real-time updates on loggers

  • ex. 10 minute data, user comes in and makes a change, the information is saved to the database and then is presented to all other users
  • the person will request a change and say what that change is
  • when there are different levels of connectivity, e.g. analog phone modems, a lot of validation is done before the data has the chance to work its way back into the system

example

measurements/metadata

  • extends beyond the VDV, more than 1 .dat file
  • integrate multiple .dat into many tables
  • managed by the data managers at DRI
  • workflow:

LoggerNet --> VDV --> Acuity --> give the data manager access to all these variables so the manager can see them --> generate an Excel file with tables for all this metadata --> enter the metadata into the Excel file --> send back to Acuity --> Acuity ingests it, runs queries, and writes back to the database --> metadata is loaded in bulk, quickly

  • we asked if the system ends before the QA/QC process begins; answer: QA/QC is done at DRI, with near-real-time QA though

future capability

  • direct the managers to the future data problems
  • manual decision making

Scotty asked about the duration (long and short term) of projects and how it affects funding. Most is funded by long-term projects; this is why they plan to do the stats and numerical methods in the future.

Amber asked about pricing. Pricing is by the hour for setup, then a price for maintaining the system for the duration of the project. With 5-10 .dat files, it takes only 8 hours of person time at DRI to make a portal.

11/25/2014

Jordan Read presented the SensorQC R package

Recorded Session Play | Download

10/28/2014

Wade Sheldon presented the GCE Data Toolbox – a short summary follows:

  • Community-oriented environmental software package
  • Lightweight, portable, file-based data management system implemented in MATLAB
  • generalized technical analysis framework, useful for automatic processing, and it's a good compromise using either programmed-in or file-based operations
  • Generalized tabular data model
  • Metadata, data, robust API, GUI library, support files, MATLAB databases
  • Benefits and costs: platform independent, sharing both code and data seamlessly across the systems, version independent as far as MATLAB goes, and is now "free and open source" software. There is a growing community of users in LTER.

Toolbox data model


  • Data model is meant to be a self-describing environmental data set-- the metadata is associated with the data, and the creation date, edit date, and lineage are maintained.
  • Quality control criteria- can apply custom function or one already in the toolbox
  • Data arrays, corresponding arrays of qualifier flags -- similar to a relational database table but with more associated metadata
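
The idea of data arrays with parallel qualifier-flag arrays and a maintained lineage can be illustrated in Python. This is only a sketch of the concept; the `Column` and `DataSet` classes, the "Q" flag, and the rule are hypothetical and do not reproduce the toolbox's MATLAB structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    units: str
    values: list
    flags: list = field(default_factory=list)   # one qualifier flag per value

    def __post_init__(self):
        if not self.flags:
            self.flags = [""] * len(self.values)
        if len(self.flags) != len(self.values):
            raise ValueError("flags must parallel values")


@dataclass
class DataSet:
    title: str
    columns: list
    lineage: list = field(default_factory=list)  # processing history

    def apply_qc(self, column_name, rule, flag, note):
        """Flag every value for which rule(value) is true and log the step."""
        col = next(c for c in self.columns if c.name == column_name)
        for i, v in enumerate(col.values):
            if rule(v):
                col.flags[i] = flag
        self.lineage.append(note)


ds = DataSet("Example met station",
             [Column("air_temp", "degC", [12.1, 99.9, 11.8])])
ds.apply_qc("air_temp", lambda v: v > 60, "Q", "flagged air_temp > 60 degC as Q")
```

Because flags live alongside the values rather than replacing them, a flagged value can still be inspected or revised later.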

Toolbox function library


  • The software library is referred to as a "toolbox"
  • a growing level of analytical functions, transformations, aggregation tools
  • GUI functions to simplify the usage
  • indexing and search support tools, and data harvest management tools
  • Command line API but there is also a large and growing set of graphical form interfaces and you can start the toolbox without even using the command line

Data management framework


  • Data management cycle - designed to help an LTER site do all of its data management tasks
  • Data and metadata can be imported into the framework and a very mature set of predefined import filters exist: csv, space- and tab-delimited and generic parsers. Also, specialized parsers are available for Sea-Bird CTD, sondes, Campbell, Hobo, Schlumberger, OSIL, etc.
  • Live connections, e.g. DataTurbine, ClimDB, SQL databases, access to the MATLAB data toolbox
  • Can import data from NWIS, NOAA, NCDC, etc.
  • Can set evaluation rules, conditions, evaluations, etc.
  • Automated QC on import but can do interactive analysis and revision
  • All steps are automatically documented, so you can generate an anomalies report by variable and date range which lets you communicate more to the users of the data

Recorded Session Play | Download

9/23/2014

  • Fox Peterson (Andrews LTER) reported on QA/QC methods they are applying to historic climate records (~13 million data points for each of 6 sites).

The challenge was that most automated approaches still produced too many flagged data points that needed to be checked manually. Multiple statistical methods were tested based on long-term historical data. The method they selected uses a moving window of data from the same hour over 30 days and tests for 4 standard deviations in that window. For example, use all data for 1 pm for days 30-60 of the year, compute four standard deviations, and set the range for the midpoint day (45) at the 1 pm hour to that range.
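
The moving-window test described above might look roughly like the following. This is a sketch, not the Andrews scripts: it assumes the range is centered on the window mean, and the function names, the 15-day half-width, and the synthetic temperatures are all hypothetical.

```python
from statistics import mean, stdev


def window_limits(observations, center_day, half_width=15, n_sigma=4.0):
    """Acceptable range for one hour of day, centered on center_day.

    observations is an iterable of (day_of_year, value) pairs taken at the
    same hour (e.g. 1 pm); pairs from several years can be pooled.
    """
    window = [v for d, v in observations if abs(d - center_day) <= half_width]
    mu, sigma = mean(window), stdev(window)
    return mu - n_sigma * sigma, mu + n_sigma * sigma


def is_flagged(value, limits):
    """True if the value falls outside the (low, high) limits."""
    low, high = limits
    return value < low or value > high


# Synthetic 1 pm temperatures for days 30-60 (made-up numbers).
one_pm = [(day, 10.0 + 0.1 * (day % 5)) for day in range(30, 61)]
limits = window_limits(one_pm, center_day=45)

typical = is_flagged(10.3, limits)   # inside the range
spike = is_flagged(25.0, limits)     # far outside the range
```

Sliding `center_day` through the year gives each day and hour its own range, so the limits follow the seasonal and daily cycle instead of using one fixed threshold.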

  • Josh Cole reported on his system, which is in development and he will be able to share scripts with the group.
  • Brief discussion of displaying results using web tools.
  • Great Basin site discussed the variability in their data, which "has no normal"-- how could we perform QA/QC based on statistics and ranges in this case?
  • Discussion of bringing Wade Sheldon to call next time / usefulness of the toolbox for data managers
  • Discussion of using the pandas package: does anyone have experience, can we get them on?
  • Discussion of the trade-off between large data stores, computational strength, and power. Good solutions?
  • ESIP email had some student opportunities which may be of interest
  • Overall, it was considered helpful if people were willing to share scripts. Discussion of a Git repository for the group, or possibly just use the Wiki.

Recorded Session: Play | Download

8/26/2014

Suggestions for future discussion topics

  • Citizen Science contributions to environmental monitoring
  • 'open' sensors - non-commercial sensors made in-house, technology, use, best practices
  • Latest sensor technologies
  • Efficient data processing approaches
  • Online data visualizations
  • New collaborations to develop new algorithms for better data processing
  • Sensor system management tools (communicating field events and associating them with data)

Recorded session: Play | Download