EnviroSensing Monthly telecons

From Earth Science Information Partners (ESIP)
Revision as of 14:00, February 9, 2018 by Mdbartos (talk | contribs) (→‎12/05/2017)

Telecons are on the first Monday of every month. Click on 'join'

Schedule

  • Next telecon: June 6, 2017 at 5:00pm EST

Notes and recordings from past telecons

To listen to recordings from past telecons, click here. To log in, use the username guest@esipfed.org, and the password Earth111.

2/06/2018

Attendees
Don Henshaw
John Porter
Scotty Strachan
Vincent Moriarty
Mike Daniels
Renee Brown
Aaron Botnick
Corinna Gries

Introduction

Recap of winter meeting

About 20 participants, from organizations including CUAHSI, EPA, etc.
Talked about sensor data workflow.
Gave survey on sensor data workflow.
Is this survey a useful product?

Comments on survey

Renee: In favor of distributing to a broader audience (LTERs, LBFS).
Renee: Few people using trusted repositories.
Mike: CHORDS trying to make use of trusted repositories more accessible.
Renee: Everybody is using different solutions. What's the use in developing our own solutions, as opposed to standards?
Corinna: DataONE is not a trusted repository; it is an aggregator.
Corinna: Coming up with a data model that works for everybody is too ambitious. Sometimes prevents people from properly using them.
John: Why so many solutions? Scale and resources, similarity of objectives, and things change very fast.
John: As for vocabulary, scaling can be an issue. Process can become very difficult if number of terms is too large.
Don: Are we seeing any commonality of approaches?
Don: Were we supposed to be identifying bottlenecks in the survey?
Scotty: Some bottlenecks include: QC flagging process. Import configuration process. Overall development of modified data model.
John: Possibly only use a subset of the survey as a planning tool.
Scotty: Could be useful for a team of, say, civil engineers who are trying to write a proposal and figure out what tools they need. Most will not be thinking of trusted repositories, e.g.
Mike: In NSF community, more emphasis on taking more data and more measurements, as opposed to what is done with those measurements (are they well-documented, etc.).
Corinna: Don't give as many choices on the survey; it may be too constraining. PIs won't be planning at this level of detail, e.g.
Aaron: The more generic a tool is, the harder it is to learn.
Conclusions: trim down diagram to get rid of specifics. Focus on general technologies used.
ODM seems to be taking ODM in a lot of new directions---goal of end-to-end sensor data management?
Corinna: could be too broad. Hard to get data in, hard to get data out.
John: Standardized response blocks. Smaller number than in current survey.
Headers instead of diagrams.


12/05/2017

Attendees
Matt Bartos
Scotty Strachan
Don Henshaw
Paul Celicourt
Renee F. Brown
Jane Wyngaard

Discussion of plans for the winter meeting

Scheduled a working session for the cluster with John Porter
Use this session to sit down, talk about workflows for end-to-end sensor data management
Everyone will hopefully have a cross-section of where performance is high and where there are gaps in performance
Don: See what software people are using in the field, what loggers are used, how data is sent back to base station?
Scotty: Have a system diagram available for everyone to show their workflows and see where they can use particular tools.
Paul: Is the working session going to be interactive?
Scotty: Depends on how many people show up. If there are 20 people, e.g., have a brief synopsis, have an exercise in which people explore different architectures and see how their workflow fits in.
Else if there are very few people, go around and describe for each user.
Don: Have you described that in the abstract?
Scotty: we kept it general.
Don: We should say "attendees should come prepared for participation"
Scotty: Concurrent with OGC, metadata carpentry, earth data science analytics, valuables consortium session (and one more)

What we should do next year:

Alternating monthly phone calls:
(i) cluster work, potential publications
(ii) Guest speakers present 20-30 minutes on an aspect of their research
Mature workflows
Completely different workspaces. Preferably earth sciences, but different.
Renee: Would like to see NEON again.
Scotty: Celeste had sensor network destroyed by wildfires (had insurance); would be interesting to see how they rebuild it.
Renee: What topics would you all be interested in?
Don: Touch base with the group and see if they have any updates.

Other things:

Jane: How does the winter session fit into the FAIR data concept?
Scotty: Look at how well workflow fits into the FAIR concept. Large strategic impact is FAIR concept; tactical focus is how do you manage increasing volumes of data with decreasing person-hours.
Jane: OGC will be at this winter meeting.
Scotty: Many groups are using a modified ODM CUAHSI data model. Similar to what Jane is using.
Jane: Can likely find speakers in RDA groups.
Don: Sustainable data management group has an ESIP wiki with notes on FAIR data.

Skip January telecon; reconvene in February.

11/07/2017

Attendees
Ethan McMahon
Felimon Gayanilo
Janet Fredericks
Renee F Brown
Colin Smith
Corinna Gries
Don Henshaw
Matt Bartos

Preliminary remarks
senior project on CHORDS at University of Nevada, Reno
Continuation of discussion on cluster goals: summary
Time to start summarizing key software tools, seeing how they can fit together and where the gaps are
If someone has to spin up a new sensor network from scratch, how could they use a set of tools to streamline the process
How to have information be interoperable?
Submitted a working session to the ESIP Winter Meeting
To plan for taking first step forward over the next couple years
Co-organizers are Scotty and John Porter
Renee stepping up to be a new co-chair for the Envirosensing Group
What would you suggest to a brand-new network manager on what they should do, what tools should they use?
Literature review for each of the main boxes
Corinna: most interesting part to talk about are the arrows.
Define what the arrows represent
How many arrows could sensorML satisfy?
More MLs than sensorML:
WaterML, EML
Data Turbine as a vehicle?
Where would it fit in, does it fit in?
Hasn't been used, because Campbell provides all the services in the management interface.
Has anyone done a study on how long it takes to set up and configure a sensor data management system?
There are some aspects of QC that can slow down transfer of streaming sensor data to the database
OGC SOS and OpenSensorHub handle many of the things on the diagram
Ethan McMahon: Is there a general diagram?
http://im.lternet.edu/resources/im_practices/sensor_data
http://im.lternet.edu/node/963

Take homes:

- Scotty: the Cluster needs to define a topic objective with specific goals for the next 1-2 years. We have been spending most of our effort looking at software tools and workflows for sensor network scientists and managers. We could detail some of this, enhancing the resources already available in the Best Practices. Our goal should be to reduce IM time spent on managing data, to increase scalability and focus more on quality. Streamlining the setup/config/monitoring processes is key, while including community standards and external connectivity.

- Updating a workflow model graphic would give helpful guidance, modify Scotty's?

- Matt: a literature review of each sector of the workflow (tools, practices) would be very helpful, and we could add those to new Guide documents.

- Corinna: the challenge is in defining what the flows between software elements consist of (formats/contents/sizes/standards). The group could spend worthwhile time on that.

- Matt: SensorML as a primary communications medium between softwares - how much does it already cover, and could it use improvement?

- We could take a "use case approach", where as an exercise we have IMs fill in the blanks on a model graphic with their solutions, highlight the bottlenecks, and make suggestions or wish lists.

- Felimon: we should have a presentation by 52north, they may have solved many of these problems or have good component solutions?

- Other resources to explore as we set up this topic: OpenSensorHub, SensorML, EML, WaterML, OGC-SOS (standards)

10/03/2017

Attendees
(Incomplete list) Don Henshaw
Scotty Strachan
Matt Bartos
Janet Fredericks
Ethan McMahon
Jane Wyngaard
John Porter
Felimon Gayanilo
Vasu Kilaru

Intro - Scotty
software framework for data management
working on proposal
but could have community projects
diagram shared by Scotty
where does FRAMEWORK fit in - dashed line
into structured database a la CUAHSI
GCE toolbox example of stand alone software
description of diagram
interfaces to sensor systems
database structure built along community standards
metadata and data proceed in parallel
harkens to FAIR principles for data management
discussion
definitely worthy of group's consideration
interested in relation with OGC and sensorWeb
Corinna input?
I was hoping Jeff would jump in, he's done most of this from the CUAHSI perspective
EDI can't do all of what is in diagram
more interested in matchmaking
want as modular as possible so we can insert expertise into particular areas
Jeff Horsburgh
knowing what we've worked on
many pieces worked on as open source
utility is variable
community to work on them would make it better
discussion
still a gap, hard to spin up CUAHSI software
database structure & standards - should follow what has already been done
worthy goal to pursue?
is this thinking too big for the cluster attention?
Two questions
some providers take the lion's share of IOT platform - Azure, Google, IOT etc. role in providing interface for embedded devices
what external connections would you like to integrate with
open question
could use a wider array of definitions for inputs
bringing in metadata from a community should only require setting up once
include built-in test cases - automated continuous development process

might be best to make end-to-end system work with most popular systems real-time applications

probably not - in external connections is near real time
lessen work for IMs etc.
or focus on practices
GCE toolbox actually covers a LOT of this ground
but built on top of Matlab
specialized metadata
can use some "lessons learned" from it
as Corinna said, would like to see modular with well-documented interfaces, perhaps as web services
could go "box by box" for discussions of what is needed and what is already working
develop some use cases
develop cluster goals
need to focus on interface/APIs between boxes
that allows others to build code on either end independently
Like modular approach - talking about process boxes
AND uses diverse software as processing
Loggernet to GCE Toolbox to CSV to DB
could fill out - broaden input
identify and move forward on generic pieces
XDOMES at Winter Meeting - will this topic be discussed there
want outreach to make a larger group available
would have no problem proposing session at winter meeting
we know how to do this as individuals - but tools for new person don't exist yet
would be good to leave some blank boxes in framework and let others fill it out
modularity sounds good - importance of pieces varies between groups
would like a plug-and-play framework
overlaying pieces on framework would identify holes and what plays well together
Would like both higher and lower level documents
need to communicate with managers and programmers
Next call
run through ideas on how data flows from place to place
identify places where other boxes are needed
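
Illustration of the "boxes and interfaces" idea from the discussion above: a minimal sketch, assuming each box is a small function that consumes and produces the same record structure, so that code on either side of an interface can be written independently. The stage names, record fields, and table layout are hypothetical and only loosely mirror the "Loggernet to GCE Toolbox to CSV to DB" chain; none of them come from the tools named in these notes.

```python
import csv
import io
import sqlite3
from typing import Callable, Iterable, Iterator

# Shared interface between boxes (hypothetical): one observation per dict,
# with keys "time", "variable", "value", and "flag".
Record = dict
Stage = Callable[[Iterable[Record]], Iterator[Record]]

def parse_logger_csv(text: str) -> Iterator[Record]:
    """Input box: turn logger output (here, CSV text) into records."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"time": row["time"], "variable": row["variable"],
               "value": float(row["value"]), "flag": ""}

def range_check(lo: float, hi: float) -> Stage:
    """QC box: mark values outside [lo, hi]; another QC tool could replace it."""
    def stage(records: Iterable[Record]) -> Iterator[Record]:
        for rec in records:
            flag = "" if lo <= rec["value"] <= hi else "out_of_range"
            yield {**rec, "flag": flag}
    return stage

def store(records: Iterable[Record], conn: sqlite3.Connection) -> int:
    """Output box: load records into a database table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS obs "
        "(time TEXT, variable TEXT, value REAL, flag TEXT)"
    )
    rows = [(r["time"], r["variable"], r["value"], r["flag"]) for r in records]
    conn.executemany("INSERT INTO obs VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

# Two made-up observations; the second is outside the plausible range.
raw = (
    "time,variable,value\n"
    "2017-10-03T00:00,air_temp_c,4.2\n"
    "2017-10-03T00:10,air_temp_c,99.0\n"
)
conn = sqlite3.connect(":memory:")
n_stored = store(range_check(-40.0, 50.0)(parse_logger_csv(raw)), conn)
flags = [row[0] for row in conn.execute("SELECT flag FROM obs ORDER BY time")]
```

Because every box agrees only on the record structure, a blank box in the framework can be filled in later without touching its neighbors.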

09/05/2017

Attendees
Don Henshaw
Scotty Strachan
Matt Bartos
Renee Brown
Cove Sturtevant
Cody Flagg
Janet Fredericks
Ethan McMahon
John Porter
Felimon Gayanilo

Recap of summer meeting

Janet's session
Wade's session
Funding Friday Prize

Cove Sturtevant (NEON) - Mobile Applications for maintenance and Field Data

Instrumented Systems
IS Science Data Quality Monitoring - automated flagging of data
Adding separate system for sensor health monitoring
Quality monitoring application
Science review
Rolling analyses
Maintenance records
  • Q: Where do you put in biases and offsets for individual sensors?
  • A: This is done in calibration lab.


  • Q: Are standard QAQC measures used? (ioos.noaa.gov/project/qartod)
  • A: Not yet.


  • Q: Are technicians viewing this in field on their phones? Do they need access to cell/wifi?
  • A: Generally use ruggedized tablets. Can pre-download data if there is no connectivity.


  • Q: Where do you get the barcodes from?
  • A: We make them.


  • Q: How long does it take to develop apps?
  • A: Single applications can be developed in less than a day. Some applications are drag and drop and can be made in a couple hours.


  • Q: What is the cost of Fulcrum?
  • A: Standard, Business, Professional. Standard is $18 per device/user per month.

06/06/2017

Attendees
Don Henshaw
Matt Bartos
Scotty Strachan
Renee Brown
Wade Sheldon

Announcements
Renee - McMurdo LTER Data Manager

Discussion of Summer Meeting Sessions

No response from DataONE yet
Four speakers right now: Matt Bartos, Mike Daniels, Mike Botts, Wade Sheldon

Poster draft

Cluster emphasis
End to end sensor data management
Best practices for deployment are fairly mature

Questions

2 & 3 Priority for envirosensing cluster
Raw vs. curated / snapshot vs. linking / strategies for metadata / sensorML vs. tabular / provenance -- how to link back to earlier versions, which data to keep
Skip July meeting; skip August; reconvene in September

05/02/2017

Attendees
Don Henshaw
Amber Jones
Matt Bartos
Felimon Gayanilo
Janet Fredericks
Martha Apple
Jason Downing
Paul Celicourt

Introduction

Discussion of Envirosensing Panel

Panelists

Confirmed
LTER
DataONE
NCEI
Possible
EPA/NASA
ARS (agricultural research stations)
EarthCube
National Snow and Ice Data Center

Questions for panel

  1. Data submission protocols
    1. What are the proper protocols and standards for submitting data to repositories?
    2. e.g. should data be sent as snapshots or streams?
  2. Integration with real-time streaming sensor networks
    1. What APIs are available for automatically pushing data to a repository?
    2. How can repositories encourage participation from small research labs that maintain their own sensor networks?
  3. Data Quality
    1. Should repositories have a role in assuring data quality?
    2. What type of quality control should be performed before submission to repositories?
    3. Should repositories provide checks for data quality?
    4. How should storage of metadata associated with data quality be handled?
  4. Data curation
    1. Who should be responsible for data curation: submitter or publisher?
    2. Would it be helpful to have an external rating system for data quality/usefulness?
  5. Data duplication
    1. What is the proper way to deal with syncing and duplication of datasets across repositories?

04/04/2017

Attendees
Matt Bartos
Scotty Strachan
Mike Daniels
Jason P. Downing
Renee F Brown
Don Henshaw
Bruce Caron
Mike Botts
Wade Sheldon
Amber Jones
Ethan McMahon
Janet Fredericks

Introduction

CHORDS: Cloud-hosted real-time data services for the geosciences

Real-time data is of critical and growing importance in the geosciences
Necessary for hazards like floods, earthquakes, etc.; but also field experiments
Enhanced rates of data transfer from the field will improve data quality and research outcomes
Organizations like NCAR have great real-time visualization tools, but the data are not easily accessible
Small research teams are taking valuable measurements that could also be of broad benefit
However, these data often aren't accessible to the broader community
Case studies:
Studying evaporation in the Great Lakes
Using infrasound to detect severe weather
Volcano monitoring in Tanzania
Crowdsourced real-time data helping to measure and predict earthquakes
Web of sensor data can be challenging to manage
Varying spatial scales, flags, metadata
Most scientists don't want to spend time reading standards
Enter CHORDS
CHORDS emphasizes simple ingest and access to real-time data
Meant for scientists who want to spend time doing science rather than managing data.
Can be set up using Amazon Web Services by a lay-user.
Data is pushed using simple HTTP GET requests.
Live demo of portal
Implementation details
SensorML used to register each site.
Data fetch via GeoJSON, CSV, etc.
Data stored in InfluxDB; MySQL used for metadata
Connects to Grafana for visualization.
Version 1.0 scheduled for October 2017
Automatic DOIs for data
Implementing OGC standards
Event triggers
CHORDS architecture
Portals operated by individuals feed into processing, translation, mapping services.
Workflows to integrate with archiving services.
URL
http://ncar.github.io/chords/
Discussion on plans for the summer
Decided on a breakout session, along with a panel.
Breakout group will focus on end-to-end sensor systems.
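
Illustration of the point above that CHORDS data is pushed using simple HTTP GET requests: a minimal sketch, assuming a portal that accepts measurements as query parameters. The host name, path, and parameter names (`portal.example.org`, `/measurements/url_create`, `instrument_id`, `temp`, `rh`, `at`) are assumptions for illustration and are not recorded in these notes; see the CHORDS documentation at the URL above for the actual interface. The sketch only builds the URL and does not send it.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlsplit

def build_push_url(base_url, instrument_id, values, at=None):
    """Encode one set of measurements as a GET URL (hypothetical layout)."""
    params = {"instrument_id": instrument_id}
    params.update(values)  # one query parameter per measured variable
    if at is not None:
        params["at"] = at.strftime("%Y-%m-%dT%H:%M:%SZ")  # UTC timestamp
    return f"{base_url}/measurements/url_create?{urlencode(params)}"

url = build_push_url(
    "http://portal.example.org",   # placeholder host, not a real portal
    instrument_id=1,
    values={"temp": 21.5, "rh": 40},
    at=datetime(2017, 4, 4, 17, 0, 0, tzinfo=timezone.utc),
)

# Decode the query string again to confirm what a portal would receive.
fields = parse_qs(urlsplit(url).query)
```

A data logger or script in the field would then issue this URL with any HTTP client, which is what makes the approach workable for a lay user.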

03/06/2017

Attendees:
Don Henshaw
Matt Bartos
Andrew Rettig
Ethan McMahon
Janet Fredericks
John Porter
Mary Martin
Paul Celicourt
Scotty Strachan
Vincent Moriarty
Martha Apple
John Andersen


Notes


Paul Celicourt - An end to end automated environmental data collection system

Objective: develop an integrated data acquisition system.
Incorporates sensing, data management, publication and analysis into the same package.
Secondary objectives:
Self-organized sensor network.
Platform-independent and protocol-agnostic.
Software application to encode and decode sensors and sensor platform descriptions.
Hardware supports most popular data interfaces.
Data publication in different formats and unit systems.
Hardware costs less than $200.
System operation and network organization
TEDS are used as a mechanism to provide metadata to each station prior to deployment.
Use CUAHSI ODM data format and Django Web Framework for automated data management.
Field deployment in Brooklyn.
Uses Zigbee protocol.
Software tools: PyTED. Sensor description using IEEE 1451 standards.
Software tools: HydroUnits. Dimensional analysis in Hydrologic computing systems using sensor-based standards.
Summary
System is capable of handling data collection, transmission, management and publication.
Effective in reducing field data acquisition workload and reducing human errors.
Currently developing an online configuration and programming tool.
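
Illustration of the dimensional-analysis idea mentioned for HydroUnits above: a minimal sketch, not the HydroUnits API. It assumes each unit carries a scale factor to SI plus exponents over the base dimensions (length, mass, time), and that a conversion is allowed only when the exponents match. The `Unit` class, the `convert` function, and the three unit definitions are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Unit:
    name: str
    to_si: float   # multiply a value in this unit by to_si to get the SI value
    dims: tuple    # exponents of (length, mass, time)

# Hypothetical unit table; 1 ft = 0.3048 m exactly.
CFS = Unit("ft^3/s", 0.3048 ** 3, (3, 0, -1))
CMS = Unit("m^3/s", 1.0, (3, 0, -1))
MM = Unit("mm", 0.001, (1, 0, 0))

def convert(value, src, dst):
    """Convert value from src to dst, refusing dimensionally inconsistent pairs."""
    if src.dims != dst.dims:
        raise ValueError(f"cannot convert {src.name} to {dst.name}: dimensions differ")
    return value * src.to_si / dst.to_si

# A discharge of 100 cubic feet per second expressed in cubic metres per second.
flow_cms = convert(100.0, CFS, CMS)
```

Checking dimensions before converting is what lets a system publish the same data in different unit systems without silently mixing, say, a depth with a discharge.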


Janet Fredericks - Update on XDOMES project

SensorML editor working, but need to develop more manufacturer-friendly vocabularies.
Want to encourage sensor manufacturers to create content.
Sensor manufacturer suggested not only creating document, but also an ID for each sensor that can be used by data managers.
Link to sensor registry spreadsheet added to wiki page.
Schedule time for showing sensorML editor at upcoming telecon.

02/06/2017

Attendees:
Scotty Strachan
Matt Bartos
Don Henshaw
Janet Fredericks
John Porter
Paul Celicourt
Vasu Kilaru


Notes

Matt Bartos - Wireless sensor networks for smart water infrastructure

Overview of research efforts underway at the University of Michigan Real-Time Water Systems Lab.
Description of wireless sensor node hardware and data backend.
Two ongoing applications:
Using wireless sensor networks for real-time flash flood monitoring in the Dallas--Fort Worth Metroplex.
Using wireless sensors to optimize stormwater quality via automated control infrastructure in Ann Arbor, MI.


Janet Fredericks - Update on XDOMES

Current efforts
Overhaul so users can create re-usable models of sensors with rulesets.
Release of sensorML editor ongoing.
Goals for cluster
Create vocabularies that manufacturers can reference (e.g. sensor types, observable properties).
Sensor vocabularies should be domain-driven rather than manufacturer driven.
Remove ambiguities in vocabularies (e.g. beam strength vs. sensor strength).
Identify some example cross-domain sensors.
Immediate goals
By March meeting: Have sensor types and observable properties.
Make spreadsheet (SensorType and ObservableProperties) a google doc that is linked to from envirosensing page to invite more community participation.

Plans for next telecon

Doodle poll to set time for upcoming meetings (effective April).

12/05/2016

Attendees:
Don Henshaw
Janet Fredericks
Alison Adams
Scotty Strachan
Andrew Rettig (started work with ES cluster several years ago, stopped about a year ago; teaching ES networking at U of Dayton)
Erin Robinson
Vasu Kilaru
Felimon Gayanilo
Eric Fritzinger (works with Scotty)
Matthew Mayernik

Notes:

Discussion of what is needed from community to enable semantic and syntactic interoperability

Sensor manufacturer responsible for making OEM model description
creates a unique identifier for each sensor
SensorML file is created and belongs to sensor owner
Data managers describe processing that's done
Documents reference terms that have meaning --> reference ontologies
Hoping to get community to help with development of ontologies, esp in communities Janet isn't a part of (i.e. not in oceanography comm'ty) -- then Janet can work with sensor manufacturers to reference those terms
Recorded presentation includes example ontology form; Wiki reference links should include date accessed
MMI ontology registry has a big vocabulary database
Can create new vocabularies and map to other known ontologies
Watch the video of the telecon for more information about how this works

Scheduling an informal lunch/coffee meeting at AGU in San Francisco (all)

Scotty added Janet's presentation schedule to his schedule for the meeting -- contact either of them if you want to meet up with them
Janet will send out this message to everyone

Next meeting (2/6/17)

Vasu and Janet talk about engaging sensor manufacturers

11/07/2016

Attendees:
Don Henshaw
Janet Fredericks
Alison Adams
Scotty Strachan
Mike Botts
Mike Dye
Carlos Rueda
Paul Lemieux
Renee Brown
Ethan McMahon

Notes:

Change in leadership

Scotty Strachan to begin co-leading EnviroSensing group with Janet Fredericks
good time to change leadership, Don is retiring soon, time to think about new directions (XDOMES? other projects?)
Matt Bartos (U Michigan) will take over as student fellow (Alison is done at the end of the year)

Discussion of new directions

Don - short recap of history of EnviroSensing cluster and original mission
this is review from previous telecons--see past notes
Janet - update on past year of work on XDOMES
ES group mission and goals are well-aligned with her work, but she needs people to participate rather than just be an audience for the work they're doing
e.g. looking for vocabularies... need this before can go to sensor manufacturers
having difficulty getting people involved -- need folks to participate!
no ES explicit presence (i.e. no specific sessions) at winter meeting, but intend to have a session at summer meeting
Don suggested more explicitly crafting this as a clear opportunity/desire to get people involved
good to tie to specific time frame, task, or event
Send out info prior to telecon, have people look at it and prep something specific, then discuss on telecon
maybe try to target specific people - perhaps folks can help ID who these people would be
follow-up meeting next month to try to get volunteers to describe sensors (spreadsheet?)
plan to have an informal ESIP/EnviroSensing get-together at AGU?
would be great to have a few volunteers, work up to people doing talks at ESIP Summer Meeting?
spend some time together talking about what this is, catching up on the cluster, etc.
Scotty - some thoughts on future directions
excited to step in and help out with the group, since it's been very helpful to expand his data management world
completed PhD over the summer, things are currently a little hectic--transitions, etc.
focus on field designs, esp. on mountain environments
could do multiple projects at once--would be good to stay focused and help the XDOMES project

06/06/2016

Attendees:
Don Henshaw
Janet Fredericks
Alison Adams
Jane Wyngaard
Scotty Strachan
Vasu Kilaru
Wade Sheldon
Mike Botts
Ethan McMahon

Notes:

Janet Fredericks on Q2O web service
Using SWE to bind metadata to observational data - enabling dynamic data quality assessment
Background: NOAA-sponsored project to address data quality in sensor web enablement frameworks

Janet walked participants through the Q2O (QARTOD-to-OGC) web service implementation to demonstrate how web services are used to describe processing, select and describe data and offer it to a user as a service.

OpenSensorHub

open sensor access so that people can discover sensor data, etc.
geolocatable/geospatially aware, fully described data, sensors, and processing
free open source software on github

SmartCity Air Challenge

Ethan: EPA is putting out a challenge called the SmartCityAir challenge: ask communities to tell them how they would deploy a large team of air sensors
provide seed money, etc.
groups have to describe how they would monitor and manage the data, deal with sensors throughout their lifecycle, do this sustainably…
gov’t learning what practices work best
will be announced formally next week; Ethan will send a less formal paragraph out to this group
interested in ideas for how to get word out about this, how to encourage people to use best data management practices that exist
focus is really on the data management side

With XDOMES and OpenSensorHub → trying to get all of that kind of knowledge (how it was used, description of data, etc.) so that you could have people throwing sensors out and have access to that data

Summer meeting:

Janet will be sending out an email about preparing for the workshop at the summer meeting → people can let Janet know that they’re coming
At Thursday session hopefully Janet, Mike, and Vasu will all be presenting
Maybe talk about whether we want to reach out to a broader audience? Janet would like to know how many people plan to attend the workshop
Poster? Don says we did one last year and could just substitute some of the stuff on there with summary points of what we’ve done this year on our calls -- Don will produce the poster if Janet sends summary slides from some of the talks to him

NO call in July -- we’ll just see each other at ESIP! Next call will be in August.

05/02/2016

Attendees:
Don Henshaw (Forest Service in Oregon)
Alison Adams (EnviroSensing Student Fellow)
Annie Burgess (ESIP)
Lindsay Barbieri (Student Fellow)
Carlos Rueda (software engineer at Monterey, part of XDOMES team)
Corinna Gries (LTER)
Felimon Gayanilo (Texas A&M, XDOMES)
Janet Fredericks (WHOI, XDOMES)
Wade Sheldon (LTER)
Mark Bushnell (oceanographer, XDOMES)
Jane Wyngaard (JPL)

Notes:

Mark Bushnell – Quality Assurance and Quality Control of Real-Time Ocean Data (QARTOD):

QARTOD manuals: focus on real time, usually coastal (http://www.ioos.noaa.gov/qartod)
Includes quality control tests and quality assurance of sensors (in an appendix)
Discussion of operational vs. scientific quality control and different needs/contexts for each
Board meets quarterly to review progress and identify next variables (if you have ideas for variables, let Mark know!)
Each manual takes 6-8 months and each one is a living document that is updated
26 core variables
next up: phytoplankton species!
Discussed an example test from the waves manual
Five “states” for data qc flags (pass, not evaluated, suspect or of high interest, fail, missing data)
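
Illustration of the five flag states listed above: a minimal sketch, assuming the numeric codes used in the QARTOD manuals (1, 2, 3, 4, 9), which are not recorded in these notes. The range-style test, its thresholds, and the function name are hypothetical and are not the waves-manual example discussed on the call.

```python
import math
from enum import IntEnum

class QCFlag(IntEnum):
    PASS = 1
    NOT_EVALUATED = 2
    SUSPECT = 3        # "suspect or of high interest"
    FAIL = 4
    MISSING = 9

def gross_range_flag(value, fail_span, suspect_span):
    """Flag one value against a hard (fail) span and a tighter (suspect) span."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return QCFlag.MISSING
    fail_lo, fail_hi = fail_span
    if value < fail_lo or value > fail_hi:
        return QCFlag.FAIL
    suspect_lo, suspect_hi = suspect_span
    if value < suspect_lo or value > suspect_hi:
        return QCFlag.SUSPECT
    return QCFlag.PASS

# Made-up wave heights in metres; the spans are illustrative thresholds only.
heights = [0.8, 12.0, 35.0, None]
flags = [gross_range_flag(h, (0.0, 30.0), (0.0, 10.0)) for h in heights]
```

Storing the flag next to each value, rather than dropping values that fail, keeps all the data available so a user or operator can choose which flag levels to include.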

Discussion & Questions for Mark:

How to handle flags that represent a mix of semantic notions? Hard for data consumers to understand (what’s the actual problem?)
What about showing (or not showing) data that doesn’t meet a certain standard?
If you’re looking for extreme events, for example, you might want to see all the data…
Helpful to have the option to see all the data (if you’re an operator, say, you might look at failed tests for an instrument)
Best to let the data user select what level of quality they’re interested in
For EnviroSensing, is there a place to save/share code for QC tests?
Not at the moment
Would be good to start tracking/storing code for tests that people do somewhere and have a DOI that describes the processing
Python code implementing the QARTOD recommendations is at https://github.com/ioos/qartod (also look here --> https://github.com/asascience-open/QARTOD)
Core link in email from Janet (body copied below) -- many other pages and additional information can be found from that link
Background: The U.S. Integrated Ocean Observing System® (IOOS) Quality Assurance/Quality Control of Real-Time Oceanographic Data (QARTOD) project has published nine data quality-control manuals since 2012. The manuals are based on U.S. IOOS-selected parameters (or core variables) that are of high importance to ocean observations. The purpose of the manuals is to establish real-time data quality control (QC) procedures for data collection for core variables, such as water levels, currents, waves, and dissolved nutrients. The QC procedures provide guidance to eleven U.S. IOOS Regional Associations and other ocean observing entities, helping data providers and operators to ensure the most accurate real-time data possible. It began as a grass-roots organization over a decade ago - the background can be found on the IOOS QARTOD Project website: http://www.ioos.noaa.gov/qartod/welcome.html
Links from Mark
The link for the flags crosswalk note is http://odv.awi.de/fileadmin/user_upload/odv/misc/ODV4_QualityFlagSets.pdf.
There's similar work at http://www.iode.org/index.php?option=com_oe&task=viewDocumentRecord&docID=10762
Tad Slawecki (tslawecki@limno.com) heads the informal QARTOD Implementation Working Group and posts minutes at https://docs.google.com/document/d/128xPGjTMBP9FC-SEg9vGe8bD145LGdapH21WV5smzn4/edit
In those minutes, he has a USGS github link at https://github.com/USGS-R/sensorQC

04/04/2016

Attendees (small group):
Don Henshaw
Mark
Corinna
Scotty Strachan
Janet Fredericks

Minutes:

Best practices work:

Post a citation suggestion on the introduction page of best practices? Maybe also a snapshot PDF that folks could download with a citation?
People DO go to this page, says Scotty
Mountain Research Institute (MRI) is doing some work trying to write best practices, would be good to try to get them to use ours rather than create something entirely new…

Future of EnviroSensing cluster:

Just need to have one vision at a time--for now we can stay focused on the XDOMES work
Scotty wants to continue to stay involved, promote cluster and its work a bit more to folks after finishing up his PhD (soon!); would be interested in taking lead in cluster after that, too

Future meetings:

Mark will present material on real-time quality control that was planned for this month on the next call instead, due to low attendance on this call
Would be great to reiterate that we didn’t start as ONLY LTER--the best practices doc had input from other folks too--and that that isn’t where we have to stay, either--we can continue to incorporate other things/groups
Janet will talk about summer meeting workshop plans on June call: registering vocabulary
will be a good workshop for beginners/people who need to be introduced to the concept
Summer meeting (July): have a 1.5-hour session for EnviroSensing--should email asking for folks to present

03/07/2016

Attendees:
Alison Adams
Don Henshaw
Janet Fredericks
John Porter
Felimon Gayanilo
Krzysztof Janowicz
Carlos Rueda
Peter Murdoch
Vasu Kilaru
Scotty Strachan
Corinna Gries
Ethan McMahon

Minutes:

1. Future meeting ideas (Don)

Revisit best practices--have folks present the chapters they did and rekindle interest; Scotty said he’d be willing to do this for his chapter
Email Alison at alison.adams@uvm.edu if you have ideas for future telecons!

2. Summer meeting -- session/workshop ideas? (proposals due at beginning of April)

XDOMES workshop: connected to EnviroSensing cluster?
sensor provenance EnviroSensing breakout session?
Janet to lead more hands-on workshop on Semantic Web, etc.?
Let Alison, Don, or Janet know if you have additional ideas

3. Update on work plan draft (Don & Janet)

Conversation with Erin last week
If you might like to be the next leader, let Don know--thinking of having an “on deck” leader position
Interest in AGU Geospace blog? Lots on data use, etc. This could be a place to put out info about our cluster

4. Rotating chair position for Products & Services (Alison)

Right now, Products & Services does three things: (1) FUNding Friday, (2) the P&S Testbed, and (3) tech evaluation process with NASA AIST to evaluate AIST-funded projects. Also working on an evaluation process for the projects funded through the testbed, and would hopefully provide this to the Earth science community at large eventually.
P&S wants to have a rotating co-chair position; each term would last for three months and would be filled by a rep from a different committee. The first co-chair (starting in April) wouldn’t be involved in proposal evaluation/selection, but would be involved in ideas for student/PI matchmaking and mentoring for FUNding Friday and the evaluation and testbed activities. It would be a great opportunity to learn more about P&S and have them learn more about us. It wouldn’t prevent you from submitting a proposal to the testbed.
If you’re interested, email Soren at sorenscott@gmail.com; you can sign up for the rotating co-chair position here

5. XDOMES team on Semantic Web (Krzysztof Janowicz)

Full recording with slides available in recorded meeting
Semantic technologies can improve semantic interoperability
More intelligent metadata, more efficient discovery, use, reproducibility of data; reduce misinterpretation and meet data requirements of journals, etc.
Large community in science, gov’t, industry; many open source and commercial tools and infrastructures; there are existing ontologies and huge amounts of interlinked data
Move the logic OUT of software and INTO the data
Semantic technologies support horizontal as well as vertical workflows
Semantic interoperability can only be measured after the fact because meaning is an emergent property of interaction; come up with technologies that prevent data that shouldn’t be combined from being combined
Discussion:
How much of this is already in use or is this just setting up the platform at this point?
With XDOMES we are at an early stage, but in terms of the infrastructure, a lot of it is already productively used

02/02/2016

Attendees:
Bar (Lindsay Barbieri) - Data Analytics Student Fellow
Corinna Gries (North Temperate Lakes LTER site)
Janet Fredericks (Woods Hole, managing coastal observatory and leading XDOMES project)
Vasu Kilaru (EPA Office of Research and Development)
Josh Cole (UMBC Data Manager -- interested in hearing about XDOMES)
Scotty Strachan (Geography Dept at University of Nevada, Reno, current PhD student)
Pete Murdoch (Science Advisor for NE Region of USGS, also working with DOI)
Ethan McMahon (EPA Office of Environmental Information)
Renee Brown (Univ of New Mexico, Sevilleta LTER and Field Station)
Jane Wyngaard (Post-doc at JPL)
Don Henshaw (Forest Service PNW Research Station)

Notes drafted by Alison Adams (EnviroSensing Student Fellow) according to meeting recording

On the agenda:
Janet on XDOMES
What should EnviroSensing cluster do?


XDOMES (Cross-Domain Observational Metadata Environmental Sensing Network) NSF project funded as part of EarthCube

  • cross-domain observational metadata
  • focusing on: sensor metadata creation, data quality assessment, sensor interoperability, automated metadata generation, content management
  • 4 types of work (funded for two years)
    • software development to create SensorML generators and registries for semantic technologies and SensorML documents
    • Trying to engage people like this community and sensor manufacturers to create a network of people interested in promoting adoption of SensorML technologies
    • Usability assessment
    • Integration of semantic interoperability
  • this project is about capturing metadata at the time of CREATION of the data
  • GOAL: put out metadata automatically in interoperable ways (community-adopted, standards-based framework)
  • use of registered terms to enable interoperability
  • can also help manage operations by providing standardized information about how sensors were configured, etc.
  • want to develop community within ESIP who will follow and promote this approach
    • vet, look at usability assessment, what do you want us to do and is it useful to you?

Discussion

  • Vasu says this is related to things EPA has been working on--excited to continue the conversation
  • Jane asked about communication with sensor-creators so this can be implemented in new sensors; Janet said that isn’t part of the project at this point
  • Potential application for folks with Alliance for Coastal Technologies--they work on sensor validation and I think it’s a slightly different take, but this is definitely something they could benefit from (like if we start describing our sensors in a way that the information can be harvested)
  • If people are interested you can see what Janet is putting in the file and think about whether this would be helpful or useful or whether she’s missing something; would you be willing to test this in your own environment?
  • Delivering data is kind of beyond the scope of the XDOMES project at this point
  • Vision: info is out on the web, and you have an app to go and harvest the information that you need
  • Do we want to work this pilot project into public presence in this cluster?

Future WebEx ideas

  • Could do a WebEx on how WaterML, SensorML, etc. fit together
  • Janet is approaching Krzysztof Janowicz to discuss sensors and semantics with us on the March call

06/22/2015

Today we had Dave Johnson from LiCOR Environmental speak with us about eddy covariance. We had a large group of attendees.

Dave shared with us links to the LiCOR software [1] and a book [2]

  • The problem they are solving is that the data coming in with eddy covariance is high resolution and large, more than 300 million records per year.
  • They were able to put much of the processing and QC logic into the SMART system, which is on the instruments in the field.
  • An eddy covariance system can "see" about 100 x its height above the canopy.
  • We were interested in the possibilities for real-time data QC, and also in how the information can be transferred from the instruments in the field (e.g., cell modem, line, etc.)

02/24/2015

Today we had William Jeffries from Heat Seek NYC talk to us about his platform for civilian monitoring of home temperatures in apartments using low-cost sensors and wireless mesh networks taking hourly readings.

Attending the call were Fox and Don from Andrews, Josh at UM, Jason in Fairbanks, Becky from Onset, Ryan from UT, and Scotty from UNR, as well as some listen-in callers.

  • Heat Seek NYC has nodes containing up to 1000 low-power, low-cost thermometers which are mapped to custom printed circuit boards. The software stack is Ruby on Rails with a PostgreSQL backend.
  • HSNYC focuses on sensor networks' ability to affect and effect policy, and so far has seen that the government views sensor networks as a cool solution to a lot of difficult regulatory challenges. They provide reliable information that officials and citizens can use, with printable, court-friendly PDFs.
  • Funding is through Kickstarter, and the project was started through the NYC Big Apps campaign, a giant civic hacking competition.
  • They have been incorporated as a nonprofit in NY since 2014.
  • First sensors were off-the-shelf Twine temperature sensors; not so good (stale data)

  • Now they make their own temperature sensors, using a push system instead of a pull system (if there's a problem with the system, you just don't get a push out of it).
  • We asked about adding an LCD display; the reason for not implementing one is that many tenants are less interested in the data than in the response from policy makers
  • Some people will abuse the system, so they use tamper-evident tape and photos of the installation to protect it; landlords have financial incentives, and there can be an intentional lack of repairs
  • Local caching of the sensor data occurs; sensors cache at the hub, relay server that sends back
  • There are several levels of caching, as well as flash memory on the XP radios
  • A key question was whether there is long-term storage that serves as a local cache. We don't have a solution either
  • We had interest in a company called H2O Degree
  • For QC they do indoor and outdoor temperature comparison, comparing the sensors to one another and to store bought sensors before and after installation...
  • Radio/caching frequency is hourly: the radios wake up each hour to handle point data
  • We are all interested in a low- or no-power source/transmitter, maybe based on Raspberry Pi
  • Need a high peak power envelope: transmission has to cope with nodes being fairly far apart, so the implementation needs to come online, transmit an awesome signal for a short time, and go back to sleep.

Note on Webex recording: I am going to check into the recording- my system claims to have been recording but it is nowhere to be found (Fox)

01/27/2015

Rick Susfalk from the Desert Research Institute presented about Acuity Data Portal. Notes from meeting taken by Fox Peterson, please edit as you see fit.

Eight people attended.

Acuity Portal System

  • Started in 2006
  • originally VDV data solution
  • improvements to web-interface; sits on top of VDV as Acuity server
  • Acuity is a continuous monitoring of key client-driven data
  • it includes sensor and data logger deployment and maintenance, telemetry, data storage and analysis, automated alerting, and a web portal for data access
  • individualized web presence tailored to client needs
  • not a single tool, but instead integrates commercial, open source, and proprietary hardware and tools
  • customizable project specific descriptions
  • common tools used to provide rapid, cost-effective deployment of individualized portals
  • physical infrastructure is shared amongst smaller clients for cost-saving or it can be segregated for larger clients.
  • access is controlled down to the variable level-- "we can define who gets to see what"-- for example, the public cannot see some features
  • one view could be "pre-defined graphs" without logging in, but if you want to download the data you must log in at your permissions level

SECURITY


ANALYSIS

  • customized thresholding and data-freshness
  • trending alerts, for example, know if battery will go bad
  • stochastic and numerical modeling
  • scoring incoming data for QA/QC processing
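
The data-freshness and trending-alert ideas above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Acuity works: the function names, the straight-line fit, and the sample voltages are all hypothetical.

```python
from datetime import datetime, timedelta


def is_stale(last_timestamp, now, max_age):
    """Data-freshness check: True when the newest record is older than max_age."""
    return now - last_timestamp > max_age


def hours_until_threshold(hours, volts, threshold):
    """Trending alert: fit a least-squares line to (hours, volts) and return
    the hours from the last sample until the line reaches threshold.

    Returns None when the fitted voltage is not declining.
    """
    n = len(hours)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(hours) / n
    mean_v = sum(volts) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    if sxx == 0:
        raise ValueError("samples must span more than one time")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(hours, volts)) / sxx
    if slope >= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - hours[-1])


# Hypothetical battery readings taken once a day (hours since first reading).
sample_hours = [0, 24, 48, 72]
sample_volts = [12.6, 12.4, 12.2, 12.0]
remaining = hours_until_threshold(sample_hours, sample_volts, threshold=11.8)
print(f"about {remaining:.0f} h until the battery reaches 11.8 V")
```

A linear trend is the simplest possible choice; a real battery discharge curve is not linear, so this is best read as a placeholder for whatever model the portal actually uses.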

QA/QC tools

  • web-based GUI
  • users and managers can create, edit, and modify alerts online
  • groups can be created so that you can schedule management and alerts
  • also offers localized redundant alerting
  • two-way communication with the Campbell data loggers (CR1000)
  • more about getting the data to the data managers for more in-depth QA/QC than about providing that part of the tooling

Data graphing features

  • Pre-determined graphs for basic users
  • Data selector for more advanced users
  • "we don't know what the users want to see so we give them the tools to do it" (good idea!)
  • anything you can change in Excel you can change in their graphs on the website

Links

Metadata

  • relates your parameters to the network and what other sensors are doing
  • current system is getting more flexible
  • metadata is still largely user responsibility

Flight plan - safety tool

  • field personnel are data
  • users put in the travel time for safety
  • buddy system: you are alerted right before you are due to return, then the system calls your boss, etc.; multi-level hierarchy

Demos

  • Portals that are monitoring things
  • ability for data refreshment
  • colors for indication, e.g., data would not be gray if there was lots of new data
  • users can change the settings on the data logger
  • scrolling, scaling, plotting, etc. via interaction with the user
  • can save your own graphs

graphs and alerts

  • many parameters
  • you can save!
  • email, sms, phone
  • default settings for users
  • lots of personnel management tools in this in general
  • cross-station "truly alarm or not" rules: e.g., if station1 reports a value but station2 reports a different one, don't alarm
  • lists/user groups appear to be very important with this tool
  • sensor and triggers: customize one or more parameters that you are bringing into your database
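
A cross-station rule of the kind described above might look like the following sketch. The function name, the tolerance parameter, and the choice to alarm when the neighbor has no data are assumptions made for illustration; the notes do not say how Acuity implements these rules.

```python
def should_alarm(primary, neighbor, threshold, tolerance):
    """Alarm only when the primary station exceeds threshold AND the
    neighboring station corroborates it (reads within tolerance).

    neighbor may be None when the second station has no current value;
    this sketch assumes we alarm in that case rather than stay silent.
    """
    if primary <= threshold:
        return False
    if neighbor is None:
        return True
    return abs(primary - neighbor) <= tolerance


# Hypothetical readings: station1 is high in both cases.
print(should_alarm(35.0, 34.0, threshold=30.0, tolerance=2.0))  # neighbor agrees
print(should_alarm(35.0, 20.0, threshold=30.0, tolerance=2.0))  # neighbor disagrees
```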

real-time updates on loggers

  • e.g., with 10-minute data, a user comes in and makes a change; the information is saved to the database and is then presented to all other users
  • the person will request a change and say what that change is
  • when there are different levels of connectivity (e.g., analog phone modems), a lot of validation is done before the data has the chance to work its way back into the system

example

measurements/metadata

  • extends beyond the VDV, more than 1 .dat file
  • integrate multiple .dat into many tables
  • managed by the data managers at DRI
  • workflow :

LoggerNet --> VDV --> Acuity --> give the data manager (DM) access to all these variables (click on it, and now the manager can see it) --> generate an Excel file with tables for all this metadata --> enter the data into the Excel file --> send back to Acuity --> Acuity ingests the file, runs queries, and writes back to the database --> metadata entered in bulk, quickly

  • we asked if the system ends before the qa/qc process begins, answer: qa/qc is done at the DRI, near real time QA though

future capability

  • direct the managers to the future data problems
  • manual decision making

Scotty asked about the duration (long and short term) of projects and how that affects funding. Most work is funded by long-term projects; this is why they plan to do the stats and numerical methods in the future.

Amber asked about pricing. Pricing is by the hour to get set up, then a price for maintaining the system for the duration of the project. For 5-10 .dat files, it takes only 8 hours of person time at DRI to make a portal.

11/25/2014

Jordan Read presented the SensorQC R package

Recorded session: Play | Download

10/28/2014

Wade Sheldon presented the GCE Data Toolbox – a short summary follows:

  • Community-oriented environmental software package
  • Lightweight, portable, file-based data management system implemented in MATLAB
  • generalized technical analysis framework, useful for automatic processing, and it's a good compromise using either programmed-in or file-based operations
  • Generalized tabular data model
  • Metadata, data, robust API, GUI library, support files, MATLAB databases
  • Benefits and costs: platform independent, sharing both code and data seamlessly across the systems, version independent as far as MATLAB goes, and is now "free and open source" software. There is a growing community of users in LTER.

Toolbox data model


  • Data model is meant to be a self-describing environmental data set-- the metadata is associated with the data, create date and edit date and such are maintained, and its lineage.
  • Quality control criteria- can apply custom function or one already in the toolbox
  • Data arrays, corresponding arrays of qualifier flags -- similar to a relational database table but with more associated metadata
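
To make the data model above concrete, here is a minimal sketch of a self-describing data set with parallel value and flag arrays, create/edit dates, and lineage. It is written in Python for illustration only; the GCE Data Toolbox itself is MATLAB, and the class names, field names, and flag code "Q" below are hypothetical, not the toolbox's actual structure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now():
    return datetime.now(timezone.utc)


@dataclass
class Column:
    name: str
    units: str
    values: list
    flags: list = field(default_factory=list)  # one qualifier flag per value

    def __post_init__(self):
        if not self.flags:
            self.flags = [""] * len(self.values)
        if len(self.flags) != len(self.values):
            raise ValueError("flags and values must have the same length")


@dataclass
class DataSet:
    title: str
    columns: dict = field(default_factory=dict)
    created: datetime = field(default_factory=_now)
    edited: datetime = field(default_factory=_now)
    lineage: list = field(default_factory=list)  # processing history

    def add_column(self, column):
        self.columns[column.name] = column
        self._record(f"added column {column.name}")

    def apply_qc(self, column_name, rule, flag, note):
        """Set flag wherever rule(value) is true, and document the step."""
        column = self.columns[column_name]
        for i, value in enumerate(column.values):
            if rule(value):
                column.flags[i] = flag
        self._record(note)

    def _record(self, note):
        self.edited = _now()
        self.lineage.append(note)


# Made-up example: flag implausible air temperatures as questionable ("Q").
ds = DataSet("demo met station")
ds.add_column(Column("air_temp", "degC", [12.1, 99.9, 11.8]))
ds.apply_qc("air_temp", lambda v: v > 60, "Q", "flagged air_temp > 60 degC")
print(ds.columns["air_temp"].flags, ds.lineage)
```

Keeping the flags in an array parallel to the values, rather than deleting suspect values, is what lets the data set carry its own QC history.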

Toolbox function library


  • The software library is referred to as a "toolbox"
  • a growing level of analytical functions, transformations, aggregation tools
  • GUI functions to simplify the usage
  • indexing and search support tools, and data harvest management tools
  • Command line API but there is also a large and growing set of graphical form interfaces and you can start the toolbox without even using the command line

Data management framework


  • Data management cycle - designed to help an LTER site do all of its data management tasks
  • Data and metadata can be imported into the framework and a very mature set of predefined import filters exist: csv, space- and tab-delimited and generic parsers. Also, specialized parsers are available for Sea-Bird CTD, sondes, Campbell, Hobo, Schlumberger, OSIL, etc.
  • Live connections, e.g., Data Turbine, ClimDB, SQL databases, access to the MATLAB data toolbox
  • Can import data from NWIS, NOAA, NCDC, etc.
  • Can set evaluation rules, conditions, evaluations, etc.
  • Automated QC on import but can do interactive analysis and revision
  • All steps are automatically documented, so you can generate an anomalies report by variable and date range which lets you communicate more to the users of the data

Recorded session: Play | Download

9/23/2014

  • Fox Peterson (Andrews LTER) reported on QA/QC methods they are applying to historic climate records (~13 million data points for each of 6 sites).

The challenge was that most automated approaches still flagged too many data points that then needed to be checked manually. Multiple statistical methods were tested against long-term historical data. The method they selected uses a moving window of data from the same hour over 30 days and tests against 4 standard deviations in that window; e.g., use all data for 1 pm for days 30-60 of the year, compute four standard deviations, and set the acceptable range for the midpoint day (45) at the 1 pm hour to that range.
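
The moving-window check described above can be sketched roughly as follows. This is not the Andrews code: the function names and synthetic readings are hypothetical, the range is assumed to be centered on the window mean (the notes do not say what it is centered on), and wrap-around at the start and end of the year is ignored.

```python
from statistics import mean, stdev


def window_limits(history, center_day, half_window=15, n_sigma=4.0):
    """Return (low, high) limits for one hour of the day on center_day.

    history: iterable of (day_of_year, value) pairs, all taken at the same
    hour (e.g. 1 pm), possibly from many years of record.
    """
    first, last = center_day - half_window, center_day + half_window
    window = [value for day, value in history if first <= day <= last]
    if len(window) < 2:
        raise ValueError("need at least two readings in the window")
    center = mean(window)
    spread = n_sigma * stdev(window)
    return center - spread, center + spread


def is_flagged(value, limits):
    """True when value falls outside the (low, high) limits."""
    low, high = limits
    return not (low <= value <= high)


# Synthetic 1 pm readings for days 30-60 (made-up numbers, not Andrews data).
history = [(day, 10.0 if day % 2 == 0 else 12.0) for day in range(30, 61)]
limits = window_limits(history, center_day=45)
print(is_flagged(11.0, limits), is_flagged(30.0, limits))
```

In practice the limits would be computed once per hour and day of year from the historical record and then applied to incoming data, which is one way to keep the number of manually checked flags down.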

  • Josh Cole reported on his system, which is in development and he will be able to share scripts with the group.
  • Brief discussion of displaying results using web tools.
  • Great Basin site discussed the variability in their data, which "has no normal"-- how could we perform QA/QC based on statistics and ranges in this case?
  • Discussion of bringing Wade Sheldon to call next time / usefulness of the toolbox for data managers
  • Discussion of using Pandas package- does anyone have experience, can we get them on?
  • Discussion of the trade off between large data stores, computational strength, and power. Good solutions?
  • ESIP email had some student opportunities which may be of interest
  • Overall, it was considered helpful if people were willing to share scripts. Discussion of a Git repository for the group, or possibly just using the Wiki.

Recorded Session: Play | Download

8/26/2014

Suggestions for future discussion topics

  • Citizen Science contributions to environmental monitoring
  • 'open' sensors - non-commercial sensors made in-house, technology, use, best practices
  • Latest sensor technologies
  • Efficient data processing approaches
  • Online data visualizations
  • New collaborations to develop new algorithms for better data processing
  • Sensor system management tools (communicating field events and associating them with data)

Recorded session: Play | Download