Sensor Data Management Middleware

From Earth Science Information Partners (ESIP)
Revision as of 16:55, March 24, 2014 by Christine.laney (talk | contribs)

back to EnviroSensing Cluster main page

Fig 1. Middleware

Overview

Middleware are software packages and procedures that reside virtually between data collectors, such as automated sensors, and data ‘consumers’, such as data repositories, websites, or other software applications. Middleware can be used to perform tasks such as streaming data from data loggers to servers, archiving data, analyzing data, or generating visualizations.

Many middleware packages are available for developing a comprehensive, reliable, and cost-effective environmental information management system. Each middleware option can have a unique set of requirements or capabilities, and costs can vary widely. A single middleware package may be used if it includes all of the user requirements, or multiple middleware may be bundled into a data management system if they are compatible or interoperable with each other and the rest of the data collection and management system.

This section describes multiple middleware packages that are currently available, and provides examples of how different software and procedures are being used to collect, analyze, visualize, and disseminate sensor-supplied environmental data.


Introduction

There are multiple factors that may affect the choice, use, and performance of middleware. These factors may be classified according to a group’s research agenda, technological requirements, and personnel skill sets.

Research Agenda: The research agenda of a group is a major determinant of the type of middleware system needed. A group focused on only one or a few narrowly focused research questions may need fewer types of sensors and consequently, fewer software modules may be adequate to streamline data processing from collection to the end goal. A team that investigates multiple questions spanning multiple research domains is likely to use more diverse and/or larger sets of sensors. There may not be a single middleware package that can meet all of the needs of a research group. In this case, multiple packages will need to be linked into a workflow.
Technological Requirements: The technological requirements of a research program may vary from simple to complex. If the research can be done with sensors from a single, well-managed company, the proprietary software packaged with the purchased sensor network may be adequate for at least a major portion of the information management system. For example, for Campbell Scientific dataloggers, their “LoggerNet” software integrates communication, data download, display and graphics functions. However, some dataloggers and sensors (particularly innovative ones, custom-built), may need custom-written software. It is important to plan time and budgets for required software upgrades, licensing, additional packages, support, and maintenance. Systems that cost less in the outset may not always be cheaper over the long run. It is also important to consider how to best meet infrastructure and bandwidth requirements, while deploying middleware on a variety of servers or laptop computers in the field or lab setting. Depending on the data and hardware infrastructure characteristics, each middleware option can introduce benefits or drawbacks to the overall system functionality.
Personnel Skills: Another key factor to consider is the skill set of the personnel. A complex data management system may require multiple people, each with a unique skill set such as database design, system architecture, web programming, etc. It is important to correctly identify each person’s skill set and role in data management tasks. It may also be necessary to plan for additional hires or job-training to addresses various scenarios and solutions, to identify appropriate salaries, and to budget enough time for software development and system administration. More details about the personnel roles and skills can be found in the “Roles and required skill sets“ section.


Methods

Middleware can be classified with respect to the functionality they provide, such as:

Controlling instrumentation and data collection: Modules may be used to control sampling intervals, manage the event-triggered (burst) or continuous sampling regimes, communicate and transfer data between the instrumentation and other system components.
Data monitoring, processing, and analysis: Modules may provide alarm management, perform automated QA/QC on data streams, or run derivative calculations including averages, aggregation and accumulation, data shifting and transformation, filtering of time series records with respect to the dates, value range, location, station/variable type, or other criteria.
Export and publishing of data: Modules may provide functionality to export sensor data to different formats (e.g., ASCII, binary, or xml), different archives, make data discoverable through geospatial catalogues, or publish the data through web services.
Data visualization: Modules may provide visualization (e.g., tables, graphs, sonograms) of geospatial and/or time series data from sensor arrays or workflow structures.
Documentation: Modules may be used to document field events through paperless collection of field data, integrate sensor data and documentation (see sensor tracking & documentation section), or handle sensor calibration records.
Other supported functionality: Modules may be used to provide access to external data (e.g., ODBC, JDBC, OLE DB), to connect or chain other middleware components, or to implement mobile applications.