Desired Characteristics of Air Quality Data Systems

From Earth Science Information Partners (ESIP)


Chapter text here.


Comments from Data Summit Workshop

Group A2: Enabling Data Access

Data system capabilities. These six information capabilities reflect the breakout’s sense of what is foundational to enabling data access and either does not exist or can be improved upon. An improved ‘system’ would:

  1. Make access to data the standard expectation for government-funded data;
  2. Increase access to orphan or gray data;
  3. Include new classes of data such as model outputs;
  4. Provide easy access to relevant archived/versioned data;
  5. Be robust and stable so that application developers and their funders would be confident in relying on such a system, even though they do not own it; and
  6. Be consistently described.

Principles:

  • The system would assume that information is always accessible when data owners deem it appropriate and when ownership and stewardship responsibilities are established. This would include versioned and archived information.
  • The system will enable user feedback in all parts of the Data Value Chain and be consistently described.
  • The system and its participants will adhere to a community-defined and community-governed suite of standards/conventions.

Community Actions:

  • Clarify ownership and stewardship responsibilities.
  • Ensure that credit is given where credit is due whenever data is accessed.
  • Work as a community to define the expectation of data accessibility by type of data.
  • Establish standard mechanisms for data discovery and description.
  • Identify what in the system can and should be standardized.
  • Learn from the experience of partners who have been working in these areas ahead of us.

Group B2: Data Processing and Integration

Data system capabilities. Group B2, which discussed data processing and integration, identified three priority information capabilities:

  1. Providing value-added functionality, including filtering, aggregation, transformation, fusion, and pre- and post-processing QA/QC;
  2. Enabling feedback on version changes, performance measures from the community and from automated value-added services (see above), and communication of assessments of suitability for use; and
  3. Establishing standard protocols for data and service access. The starting point for these standards is the set adopted by OpenGIS and GEOSS, including Web Coverage Service (WCS), Web Feature Service (WFS), and Web Map Service (WMS).
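
As an illustration of what a standard access protocol buys application developers, the following is a minimal sketch of building a WMS 1.3.0 GetMap request with the Python standard library. The endpoint URL, the layer name `pm25_surface`, and the date are hypothetical placeholders, not services discussed at the workshop; only the request parameter names come from the WMS standard.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_wms_getmap_url(endpoint, layer, bbox, time=None,
                         width=800, height=400, image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat), expressed in CRS:84,
    which uses longitude/latitude axis order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    if time is not None:
        params["TIME"] = time  # optional time dimension, if the layer has one
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer, for illustration only.
url = build_wms_getmap_url(
    "https://example.org/wms",
    "pm25_surface",
    (-125.0, 24.0, -66.0, 50.0),
    time="2008-07-04",
)
print(url)
```

Because every conforming server accepts the same parameters, a client written this way could in principle be pointed at a different provider by changing only the endpoint and layer name.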

Principles:

  • Provide adequate information to end-user application developers to allow appropriate development;
  • Provide knowledge and insight in post-processing and analysis;
  • Accommodate metadata demands;
  • Have reliable and robust value-added services; and
  • Be stably governed, including service level agreements where necessary.

Community Actions:

  • Inreach and outreach. Inreach means increasing visibility within partner agencies; EPA was specifically mentioned. Outreach would include providing ‘layman’ explanations and working to relate/link the efforts of this community to parallel groups and projects.
  • Continue this process, including implementing a governance mechanism to convene the community in a fashion that allows decision making.
  • Identify and execute a pilot. Candidates included remounting capabilities of the Health Effects Institute or establishing a CAP for a specific AQ event class.

Group C2: Visualization / Analysis

Data system capabilities.

  • Multiplatform support (flexibility): the ability for users to work with many systems
  • Standards-based spatial and temporal capability
  • Ability to aggregate by user, including multiscale
  • Display and communication of uncertainty
  • Sufficient description/annotation of visualizations (e.g., units, scales)
  • Capabilities for interpolation (spatially & temporally)
  • Multiple options for visualization (e.g., multiple ways for interpolation, including no interpolation)
  • Recommended default with other options for more advanced users
  • Tiered capabilities
  • Consistent color
  • Ability to interface/export/translate between standard systems (e.g., Google)
  • Ability to discover
  • Means to support a defined need or user group
  • Include multiple data types & forms – ground, satellite, models and able to work with multiple scales
  • Multiple Output Forms (e.g., PNG, MPEG, KML/KMZ, Excel, PPT)
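
The interpolation items in the list above (several methods, a "no interpolation" option, and a recommended default for less advanced users) can be sketched as a single function with a method switch. This is an illustrative sketch only: the function name, the method names, and the choice of linear interpolation as the default are assumptions, not decisions recorded by the group, and the sample values are made up.

```python
from bisect import bisect_left

METHODS = ("none", "nearest", "linear")


def interpolate(times, values, t, method="linear"):
    """Estimate a value at time t from samples sorted by time.

    method "none": return a value only when t is an observed time.
    method "nearest": return the closest sample (ties go to the earlier one).
    method "linear": interpolate between the two neighboring samples
                     (assumed here to be the recommended default).
    Returns None when t lies outside the sampled range; no extrapolation.
    """
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]  # exact observation, identical for every method
    if method == "none":
        return None
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    if method == "nearest":
        return v0 if (t - t0) <= (t1 - t) else v1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Made-up hourly samples with a gap at hour 2.
hours = [0, 1, 3]
pm25 = [10.0, 14.0, 20.0]
for m in METHODS:
    print(m, interpolate(hours, pm25, 2, method=m))
```

Keeping "none" as an explicit option lets a visualization show gaps honestly instead of implying measurements that were never taken, which also bears on the uncertainty item in the same list.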

Principles:

  • Diversity of tools and applications. This is important to assure a robust system; it appeals to specific users and provides flexibility.
  • To provide flexibility, the system should place a premium on modularity and portable development.
  • Visualizations and data provided to end users should have scientific credibility, including information on levels of quality and a documented chain of custody and data source.
  • To help end users, visualization & analysis products must include interpretation and help files. This is, in part, accomplished by assuring that the ‘system’ knows its users and how they want the information. This type of interface with users could help with defining user requirements, building community, and enabling feedback (up and down stream).
  • Data should be visualized at appropriate spatial and temporal scales for analysis.

Community Actions:

  • Owners of systems should make services available outside the firewall and put WMS/WCS standards in place (e.g., datafed, GIOVANNI).
  • Tool providers need to ask for & include lineage of data (be explicit in presentation). Existing systems should make their info available as web services.
  • EPA, NOAA, NASA, and CDC should develop an interagency agreement to define commitment to the process.
  • The community must convene to develop guidance on metadata for end users (i.e., an “idiot’s guide to metadata”).
  • The community must continue to engage end users through outreach, pilots, requirements development, and by linking end users back to data generators. As part of this outreach, the community should consider an end-user panel/advisory group.
  • To energize the community, and especially the federal contingent, partners should enact/reenergize/update interagency MOUs and begin briefings within agencies, including successful examples/models and an inventory/evaluation of existing systems.

Group Discussion (Top Priorities)

Following the breakout sessions and the report out from the breakout sessions, the group identified common themes and the top priority areas for the community to consider. The participants were each asked to identify up to five themes and priority areas. Participant responses are summarized below.

Metadata

The need for and use of metadata was a predominant theme in the breakout groups and during plenary discussion. The group identified two types of metadata that are important to the community: innate and emergent. Innate metadata is provided by the data provider, needs to be standardized for the entire community, and must be propagated by the system (supplied unchanged on demand). Emergent metadata is user-generated information and could include opinions. The group agreed that, whatever the community ultimately decides, the metadata requirement should leverage what partners have already implemented, should be facilitated by metadata generation tools, and should not be unduly burdensome or become a huge monster. The meeting participants identified that a next step would be investigating and documenting existing efforts in partner federal agencies; EPA was specifically mentioned.
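
One way to picture the innate/emergent distinction is as a data structure: the innate part is read-only once the provider has supplied it, while the emergent part grows as users contribute. The sketch below is only an illustration of that idea. The class names, the field names (`provider`, `parameter`, `units`), and the sample values are hypothetical and do not represent a proposed community standard.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, field


@dataclass(frozen=True)
class InnateMetadata:
    """Provider-supplied metadata; frozen so it is propagated unchanged."""
    provider: str
    parameter: str
    units: str


@dataclass
class DatasetRecord:
    """A dataset description carrying both kinds of metadata."""
    innate: InnateMetadata
    emergent: list = field(default_factory=list)  # user-generated notes

    def annotate(self, user, comment):
        """Append user-generated (emergent) metadata, such as an opinion."""
        self.emergent.append({"user": user, "comment": comment})

    def describe(self):
        """Return innate metadata as supplied, plus a copy of the notes."""
        return {"innate": asdict(self.innate), "emergent": list(self.emergent)}


# Hypothetical values, for illustration only.
record = DatasetRecord(InnateMetadata("Agency X", "PM2.5", "ug/m3"))
record.annotate("analyst1", "Site appears to read high during smoke events.")

try:
    record.innate.units = "ppm"  # users cannot alter provider metadata
except FrozenInstanceError:
    print("innate metadata is read-only")

print(record.describe())
```

Separating the two keeps user opinions from being mistaken for provider statements while still letting both travel with the data.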

Standards

It was clear that an important next step for the community is to establish a small set of data and technical standards that would enable consistent implementations and ‘blue line’ development/enhancement. The group consensus was that it was important to do this as soon as possible, since having something, even if not a comprehensive list, is better than nothing. The meeting participants identified that an appropriate next step might be to establish and convene a small group of scientific and IT individuals to evaluate data standards, data formats, and technical standards. This group would leverage partner experience (e.g., NOAA, EPA, WMO) and prepare a recommendation for the community to consider.

End User Requirements Development

The summit participants felt that a concerted, increased, and focused effort to include end users in requirements development would be substantially beneficial to the AQ data community. This could take several different paths, including hosting meetings between system developers and end users, establishing user groups with connections to application developers, and creating a mechanism to include end users in the evaluation of overall system performance. This would be a small part of a larger effort to include the end user in more and more of the AQ data community’s work. The group identified that a next step could be to work with the AQ data community to identify end users and pilot one joint requirements development effort, such as the Health Effects Institute remount described above.

Feedback

Every technical breakout group identified the importance of feedback, albeit each with a different flavor. The feedback the AQ community is seeking is person-to-person, agency-to-agency, and system-to-system across multiple subjects, including feedback on data (usage, quality, gaps), models (performance, results, gaps, usability), and systems (performance, development, enhancements, gaps, issues). Each of these feedback relationships implies a different solution and a different set of roles and responsibilities. Consensus among the group was that feedback is extremely important but must be thought out and formalized. The group spent some time talking about the efficacy of the Wiki as a tool to enable feedback and concluded that more information was needed.
