EnviroSensing Monthly telecons
Telecons are held on the fourth Tuesday of every month at 4:00 PM ET.
- December 23, 2014, 4:00 PM EST. No telecon - Happy Holidays
- January 27, 2015, 4:00 PM EST. Rick Susfalk, Division of Hydrologic Sciences, Desert Research Institute
Notes and recordings from past telecons
To watch the recordings, make sure you allow pop-ups for https://esipfed.webex.com
Rick Susfalk from the Desert Research Institute presented the Acuity Data Portal. Notes from the meeting were taken by Fox Peterson (please edit as you see fit). Eight people were in attendance.
Acuity Portal System
- Started in 2006
- originally a Vista Data Vision (VDV) data solution
- improvements to the web interface; it sits on top of VDV as the Acuity server
- Acuity provides continuous monitoring of key client-driven data
- it includes sensor and data logger deployment and maintenance, telemetry, data storage and analysis, automated alerting, and a web portal for data access
- individualized web presence tailored to client needs
- not a single tool, but instead integrates commercial, open source, and proprietary hardware and tools
- customizable project specific descriptions
- common tools used to provide rapid, cost-effective deployment of individualized portals
- physical infrastructure is shared among smaller clients for cost savings, or it can be segregated for larger clients
- access is controlled down to the variable level ("we can define who gets to see what"); for example, the public cannot see some features
- one view could be "pre-defined graphs" without logging in, but to download the data you must log in at your permission level
- security is a high priority
- DoD certification and accreditation (possible link below)
- they have hired a security professional
- HTTPS everywhere as good protocol
- customized thresholding and data-freshness checks
- trending alerts, e.g., knowing in advance if a battery will go bad (see the sketch after this list)
- stochastic and numerical modeling
- scoring incoming data for QA/QC processing
- web-based GUI
- users and managers can create, edit, and modify alerts online
- groups can be created so that management duties and alerts can be scheduled
- also offers localized redundant alerting
- two-way communication with the Campbell Scientific data loggers (CR1000)
- the focus is more on getting the data to the data managers for in-depth QA/QC than on providing that part of the tooling
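As an illustration of the trending-alert idea above, here is a minimal sketch in Python: fit a linear trend to recent battery-voltage readings and estimate when the voltage will cross a low-voltage threshold. This is hypothetical logic, not Acuity's implementation; the threshold, alert window, and sample data are all assumptions.

```python
# Hypothetical trending alert: linear extrapolation of battery voltage.
# Not Acuity's implementation; names, threshold, and data are illustrative.
import numpy as np

def days_until_threshold(times_days, voltages, threshold=11.5):
    """Fit a line to (time, voltage) samples and return the estimated
    number of days until the voltage crosses `threshold`, or None if
    the trend is flat or rising."""
    slope, intercept = np.polyfit(times_days, voltages, 1)
    if slope >= 0:
        return None  # battery not declining; nothing to predict
    t_cross = (threshold - intercept) / slope  # time when the fit hits the threshold
    return t_cross - times_days[-1]

# Example: alert if the battery is projected below 11.5 V within two weeks
t = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)           # days
v = np.array([12.8, 12.7, 12.7, 12.6, 12.5, 12.4, 12.3])   # volts
eta = days_until_threshold(t, v)
if eta is not None and eta < 14:
    print(f"ALERT: battery projected below threshold in {eta:.1f} days")
```

The same pattern (fit a trend, extrapolate to a threshold, alert inside a lead-time window) applies to any slowly degrading parameter, not just batteries.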
Data graphing features
- Pre-determined graphs for basic users
- Data selector for more advanced users
- "we don't know what the users want to see so we give them the tools to do it" (good idea!)
- anything you can change in an Excel chart you can change in their graphs on the website
- vista data vision (http://www.vistadatavision.com/)
- another vdv link: (http://www.vistadatavision.com/features/responsive/)
- relates your parameters to the network and what other sensors are doing
- current system is getting more flexible
- metadata is still largely the user's responsibility
Flight plan - safety tool
- field personnel are treated as data in the system
- users enter their travel times for safety tracking
- buddy system: a buddy is alerted right before your scheduled return; if you don't check in, the system escalates (calls your boss, etc.) through many levels of hierarchy
- portals that monitor things
- ability to check data freshness
- colors indicate freshness; for example, data would not be gray if lots of new data had arrived
- users can change the settings on the data logger
- interactive scrolling, scaling, plotting, etc.
- can save your own graphs
Graphs and alerts
- many parameters
- you can save!
- email, sms, phone
- default settings for users
- lots of personnel management tools in this in general
- cross-station "truly alarm or not" rules: for example, if station 1 reports an out-of-range value but nearby station 2 reads something quite different, don't alarm, since it is likely a sensor glitch (see the sketch after this list)
- lists/user groups appear to be very important with this tool
- sensors and triggers: customize alerts on one or more parameters that you are bringing into your database
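Here is a minimal sketch of that kind of cross-station suppression rule in Python. This is hypothetical logic inferred from the note above, not Acuity's actual rules; the bounds and agreement tolerance are assumptions.

```python
# Hypothetical cross-station alarm suppression (not Acuity's actual rules).
# Alarm only when a nearby station corroborates the out-of-range reading;
# if the neighbor reads something very different, assume a sensor glitch.

def should_alarm(primary, neighbor, low, high, agree_tol=2.0):
    """Return True if `primary` is outside [low, high] and the neighbor
    station reads a similar value (i.e., the event looks real)."""
    out_of_range = not (low <= primary <= high)
    if not out_of_range:
        return False
    neighbor_agrees = abs(primary - neighbor) <= agree_tol
    return neighbor_agrees  # disagreement -> suppress ("don't alarm")

# Station 1 reads 45 C but station 2 reads 21 C: likely a glitch, no alarm
print(should_alarm(45.0, 21.0, low=-30.0, high=40.0))  # False
print(should_alarm(45.0, 44.5, low=-30.0, high=40.0))  # True: corroborated
```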
Real-time updates on loggers
- e.g., with 10-minute data, a user comes in and makes a change; the information is saved to the database and is then presented to all other users
- the person requests a change and says what that change is
- when there are different levels of connectivity (e.g., analog phone modems), a lot of validation is done before the data have a chance to work their way back into the system
- Everglades heat-pulse flow meters
- uses Google Maps
- extends beyond VDV: more than one .dat file
- integrates multiple .dat files into many tables (see the sketch after this list)
- managed by the data managers at DRI
- workflow: LoggerNet --> VDV --> Acuity --> grant the data manager access to all these variables (click to enable, and the manager can now see them) --> generate an Excel file with tables for all this metadata --> enter the metadata into the Excel files --> send back to Acuity --> Acuity ingests it, runs queries, and writes back to the database --> metadata loaded in bulk, quickly
- we asked whether the system ends before the QA/QC process begins; answer: QA/QC is done at DRI, though near-real-time QA is performed
- the system directs the managers to future data problems
- manual decision making
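As context for pulling multiple logger .dat files into tables, here is a minimal sketch, assuming Campbell Scientific TOA5-format .dat files (four header rows: station info, column names, units, aggregation type). This is generic pandas code, not DRI's actual ingest pipeline, and the directory name is hypothetical.

```python
# Minimal sketch: combine several Campbell TOA5 .dat files into one table.
# Not DRI's ingest code; the "loggers/" directory is hypothetical.
import glob
import pandas as pd

frames = []
for path in sorted(glob.glob("loggers/*.dat")):
    df = pd.read_csv(
        path,
        skiprows=[0, 2, 3],         # keep row 2 (column names) as header
        parse_dates=["TIMESTAMP"],  # standard TOA5 timestamp column
        na_values=["NAN"],          # Campbell's missing-value marker
    )
    df["source_file"] = path       # keep lineage for the data manager
    frames.append(df)

combined = pd.concat(frames, ignore_index=True).sort_values("TIMESTAMP")
print(combined.head())
```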
Scotty asked about project duration (long- and short-term) and how it affects funding. Most work is funded by long-term projects; this is why they plan to do the statistical and numerical methods in the future.
Amber asked about pricing: pricing is by the hour to get a portal up, then a price for maintaining the system for the duration of the project. With 5-10 .dat files, it takes only about 8 hours of person time at DRI to build a portal.
Jordan Read presented the SensorQC R package
Wade Sheldon presented the GCE Data Toolbox – a short summary follows:
- Community-oriented environmental software package
- Lightweight, portable, file-based data management system implemented in MATLAB
- generalized technical analysis framework, useful for automated processing; it is a good compromise, supporting either programmed or file-based operations
- Generalized tabular data model
- Metadata, data, robust API, GUI library, support files, MATLAB databases
- Benefits and costs: platform independent, shares both code and data seamlessly across systems, version independent as far as MATLAB goes, and is now "free and open source" software. There is a growing community of users in LTER.
Toolbox data model
- The data model is meant to be a self-describing environmental data set: the metadata is associated with the data, and the create date, edit date, and lineage are maintained.
- Quality control criteria: you can apply a custom function or one already in the toolbox
- Data arrays with corresponding arrays of qualifier flags; similar to a relational database table but with more associated metadata (illustrated in the sketch below)
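To illustrate the general idea of that data model, here is a generic sketch in Python: data columns travel with parallel qualifier-flag arrays, units, and processing lineage. This is NOT the GCE Data Toolbox's actual MATLAB structure, just the shape of the concept; all names are illustrative.

```python
# Generic illustration of a self-describing tabular data model with
# parallel qualifier flags, in the spirit of the model described above.
# Not the GCE Data Toolbox's actual structure.
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    units: str
    values: list   # the data array
    flags: list    # one qualifier flag per value ("" = OK)

@dataclass
class DataSet:
    title: str
    created: str
    history: list = field(default_factory=list)   # processing lineage
    columns: dict = field(default_factory=dict)

    def add_column(self, col: Column):
        self.columns[col.name] = col
        self.history.append(f"added column {col.name} ({col.units})")

ds = DataSet(title="Met station example", created="2015-01-27")
ds.add_column(Column("airtemp", "deg C", [1.2, 2.3, 99.9], ["", "", "Q"]))
```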
Toolbox function library
- The software library is referred to as a "toolbox"
- a growing library of analytical functions, transformations, and aggregation tools
- GUI functions to simplify the usage
- indexing and search support tools, and data harvest management tools
- Command-line API, but there is also a large and growing set of graphical form interfaces; you can start the toolbox without even using the command line
Data management framework
- Data management cycle - designed to help an LTER site do all of its data management tasks
- Data and metadata can be imported into the framework, and a very mature set of predefined import filters exists: CSV, space- and tab-delimited, and generic parsers. Specialized parsers are also available for Sea-Bird CTD, sondes, Campbell, Hobo, Schlumberger, OSIL, etc.
- Live connections, e.g., Data Turbine, ClimDB, and SQL databases, with access from the MATLAB toolbox
- Can import data from NWIS, NOAA, NCDC, etc.
- Can set evaluation rules, conditions, etc.
- Automated QC on import but can do interactive analysis and revision
- All steps are automatically documented, so you can generate an anomalies report by variable and date range which lets you communicate more to the users of the data
- Fox Peterson (Andrews LTER) reported on QA/QC methods they are applying to historic climate records (~13 million data points for each of 6 sites).
The challenge was that most automated approaches still produced too many flagged data points that needed to be manually checked. Multiple statistical methods were tested against the long-term historical record. The method they selected uses a moving window of data from the same hour of day across 30 days and flags values more than 4 standard deviations from the window mean; e.g., take all 1 PM data for days 30-60 of the year, compute the mean and the 4-standard-deviation bounds, and set the valid range for the midpoint day (day 45) at the 1 PM hour to those bounds. A minimal sketch of this check follows.
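Here is a minimal sketch of that moving-window check in Python with pandas. This is an illustration under assumptions, not the Andrews LTER code: it assumes the data are a pandas Series with a DatetimeIndex, ignores day-of-year wraparound at year boundaries, and uses an illustrative column name.

```python
# Minimal sketch of the same-hour, 30-day moving-window +/- 4 sigma check.
# Not the Andrews LTER implementation; assumptions noted above.
import numpy as np
import pandas as pd

def flag_same_hour_window(series, window_days=30, n_sigma=4.0):
    """Flag values more than n_sigma standard deviations from the mean of
    same-hour observations within +/- window_days/2 days of each day."""
    half = window_days // 2
    flags = pd.Series(False, index=series.index)
    hours = series.index.hour
    for h in range(24):
        at_hour = series[hours == h].dropna()
        doy = at_hour.index.dayofyear.values
        for d in np.unique(doy):
            window = at_hour[np.abs(doy - d) <= half]  # same hour, nearby days
            mu, sigma = window.mean(), window.std()
            if not np.isfinite(sigma) or sigma == 0:
                continue  # too little data to judge
            target = at_hour[doy == d]
            bad = target[np.abs(target - mu) > n_sigma * sigma]
            flags[bad.index] = True
    return flags

# Hypothetical usage: flags = flag_same_hour_window(df["airtemp"])
```

Looping over every hour and day keeps the logic explicit but would be slow on ~13 million points; a production version would vectorize or precompute the windows.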
- Josh Cole reported on his system, which is in development and he will be able to share scripts with the group.
- Brief discussion of displaying results using web tools.
- The Great Basin site discussed the variability in their data, which "has no normal"; how could we perform QA/QC based on statistics and ranges in this case?
- Discussion of bringing Wade Sheldon onto the next call and of the usefulness of the toolbox for data managers
- Discussion of using the Pandas package: does anyone have experience, and can we get them on a call?
- Discussion of the trade-off between large data stores, computational strength, and power. Good solutions?
- An ESIP email listed some student opportunities that may be of interest
- Overall, it was considered helpful if people were willing to share scripts. Discussion of a Git repository for the group, or possibly just using the wiki.
Suggestions for future discussion topics
- Citizen Science contributions to environmental monitoring
- 'open' sensors - non-commercial sensors made in-house, technology, use, best practices
- Latest sensor technologies
- Efficient data processing approaches
- Online data visualizations
- New collaborations to develop new algorithms for better data processing
- Sensor system management tools (communicating field events and associating them with data)