Solta 2011 Agenda
Monday 16:00-18:00: Data Catalog Side Meeting ||| Discussion
GI-cat - ESSI Federated Catalog (aka EuroGEOSS Discovery Broker, part of GEOSS Common Infrastructure)
AQComCat - Air Quality Community Catalog
GI-cat-AQComCat link - Federating AQComCat into GI-cat; Accessing GI-cat from AQComCat
Monday 18:00-19:00 Social
Tue 08.00-10.00: Self-Introduction, 5 mins/participant
Agenda and Procedures
- Daily 8:00-12:30; 16:00-19:00; Coffee break and 18:00-19:00 (1.5 h) informal interaction
- Total of ten two-hour sessions.
- First half focused on data server, catalog software
- Second half on more general AQ data networking
- Two rapporteurs for each session
Self-Introduction - by each participant
- Slides 1-2: Name, institution, relevant research (on interoperability, networking), participation in major projects/programs
- Slide(s) 3-(4): What would you like to take away from the workshop; what would you like to offer to the workshop (or AQ CoP)
Tue 10.30-12.30: Introduction of Hubs, 5 min each
Intro of AQ Data Hubs - by their representatives
- AQ Community Servers: Common DataFed, FJ Juelich, NGC/CIERA, EBAS
- Other Servers: DLR/ACP, AIRNow, EEA NRT, AQMEII (AeroCom, RSIG, GIOVANNI)
Air Quality Community of Practice (AQ CoP) - R Husar
Air Quality Community Data Server - M. Schultz
Tue 16.00-18.00: IT Breakout: AQ Community server software
Use of netCDF and other data formats
Gridded data service through WCS
Station-point data service (SQL)
Data server performance issues/solutions; Server co-development tools
Relationship to other WCS servers; Real Data-to-WCS/WFS/WMS-Mapping
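As a rough illustration of the "gridded data service through WCS" topic, the sketch below builds a WCS 1.0.0 GetCoverage request in key-value form. The endpoint, coverage name, bounding box, and the "NetCDF" format label are hypothetical placeholders; real servers advertise their own coverage names and supported formats in their capabilities documents.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_getcoverage_url(endpoint, coverage, bbox, time, width, height,
                          fmt="NetCDF", crs="EPSG:4326"):
    """Assemble a WCS 1.0.0 GetCoverage URL from key-value parameters."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                     # name of the gridded dataset
        "CRS": crs,                               # coordinate reference system
        "BBOX": ",".join(str(v) for v in bbox),   # minx,miny,maxx,maxy
        "TIME": time,                             # ISO 8601 time position
        "WIDTH": width,                           # output grid size in cells
        "HEIGHT": height,
        "FORMAT": fmt,                            # assumed format label
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage name, for illustration only.
url = build_getcoverage_url(
    "http://example.org/wcs",
    "ozone_surface",
    (-10.0, 35.0, 30.0, 60.0),
    "2011-05-16T12:00:00Z",
    400,
    250,
)
query = parse_qs(urlparse(url).query)  # decode the URL back into parameters
print(url)
```

Keeping the request as plain key-value pairs means any client that can issue an HTTP GET can ask for a spatial and temporal subset without knowing how the server stores its data.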
Tue 16.00-18.00: Scope and Type of Data to be Served
- Ensemble System (JRC) Presentation from Stefano Galmarini on AQMEII and ENSEMBLE (Data Hub/Facilitator)
- Written in Perl and IDL
- Comparisons and evaluation of models
- Coordinated model harmonization across 27 models
- Also a repository for the dissemination of model and AQ measurements
- Produced 4-page tech spec docs that were sent to groups for supplying data; this was a key step in making the process smoother.
- Main features
- Transfer of very large model output across internet
- Storage of 1D, 2D, and 3D model data
- Quick access of large dataset
- Distribution of KML and WMS (in progress)
- Use an ASCII format file to describe the model files
- Have a program called ENFORM (Fortran) that transforms model data into the compressed dataset (1.2GB → 200MB)
- Can produce GE projected files (nice)
- 11 papers came out in a special journal edition
- Data Versioning.
AVik suggested that the timestamp of data submission is a useful versioning approach.
- Need additional flags on observation level
- Latest time stamp requires to data ...
- Version dates could confuse broad range of users
- Martin: Applications would use the version date, not the user
- Who are the next users in this chain?
- GEO AQ CoP.
- RHusar: What is the AQ Community of Practice, and what does it do? CoP Intro PPT
- Pic of data pool - started discussion with the idea that the CoP should only create a distributed data pool and build the data network.
- The group thought this was too narrow and had issues with how the CoP differed from the facilitators.
- Reworded the CoP purpose: the CoP should "Connect and Enable" AQ data networks, hubs and facilitators so that they can connect and enable the data needed for their systems.
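The timestamp-based versioning raised in the Data Versioning discussion above could look roughly like the sketch below. The dataset names, dates, and helper functions are hypothetical and are not taken from ENSEMBLE; the point is only that an application, rather than a person, resolves which submission counts as current.

```python
from datetime import datetime

# Hypothetical submissions: (dataset id, submission timestamp, note)
submissions = [
    ("model_A", datetime(2011, 3, 1, 9, 0), "first upload"),
    ("model_A", datetime(2011, 4, 12, 15, 30), "corrected upload"),
    ("model_B", datetime(2011, 3, 20, 8, 0), "first upload"),
]


def latest_versions(records):
    """Return the most recently submitted record for each dataset."""
    latest = {}
    for dataset_id, submitted, note in records:
        current = latest.get(dataset_id)
        if current is None or submitted > current[0]:
            latest[dataset_id] = (submitted, note)
    return latest


def version_as_of(records, dataset_id, when):
    """Return the record that was current at time `when`, or None."""
    candidates = [(s, n) for d, s, n in records
                  if d == dataset_id and s <= when]
    return max(candidates, default=None)


current = latest_versions(submissions)
earlier = version_as_of(submissions, "model_A", datetime(2011, 4, 1))
```

Because every submission is kept and only the lookup changes, an analysis can be reproduced later by asking for the version that was current on a given date.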
Data Providers: Existing Data Hubs - how do they work?
- by data source-driver (mandated, research)
- by content/platform (emission, ambient, remsens, model)
- by space-time (global, regional)
- Primary (original), Secondary, Mediated
- Raw, processed, how?
Tue 18.00-19.00: General Discussion, with wine and cheese
Wed 8.00-10.00: Breakout reports, general server items
What few things must be the same, so that everything else can be different?
Report from the IT breakout session: Community server software
Report from non IT breakout session: ADN scope, providers, users
Report from the pre-workshop Data Catalog side meeting
Crossover topics between IT and non-IT issues
Wed 10.30-12.30: Data network catalog and clients
Functionality of an Air Quality Data Network Catalog: who, what, where, when
Catalog content and structure (granularity) of ADNC?
Minimal metadata for discovery, data provenance, quality, access constraints?
Single AQ Catalog? Distributed? Service-oriented? Access rights?
- Ben: Similarity with hydrology
- Stefano: GI-cat - help on CSW implementation
- Paul: Catalog has to be linked to the data offering services
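One possible reading of the "who, what, where, when" questions, together with Paul's point that the catalog has to link to the data offering services, is sketched below. The field names, hub names, and URLs are illustrative assumptions and not an agreed ADNC schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass
class CatalogRecord:
    who: str                                   # data provider or hub
    what: str                                  # observed or modelled parameter
    where: Tuple[float, float, float, float]   # west, south, east, north
    when: Tuple[date, date]                    # first and last day covered
    service_url: str                           # link to the data offering service
    access: str = "open"                       # access constraint label


def boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def discover(records, what, where, day):
    """Filter records by parameter, spatial overlap, and temporal coverage."""
    return [r for r in records
            if r.what == what
            and boxes_overlap(r.where, where)
            and r.when[0] <= day <= r.when[1]]


# Hypothetical catalog entries, for illustration only.
catalog = [
    CatalogRecord("Hub A", "ozone", (-10, 35, 30, 60),
                  (date(2005, 1, 1), date(2010, 12, 31)),
                  "http://example.org/hub-a/wcs"),
    CatalogRecord("Hub B", "ozone", (-130, 20, -60, 55),
                  (date(2000, 1, 1), date(2011, 6, 30)),
                  "http://example.org/hub-b/wcs"),
    CatalogRecord("Hub A", "pm25", (-10, 35, 30, 60),
                  (date(2005, 1, 1), date(2010, 12, 31)),
                  "http://example.org/hub-a/wcs"),
]

hits = discover(catalog, "ozone", (0, 40, 10, 50), date(2008, 7, 1))
```

Each hit carries its own service link, so discovery in the catalog leads directly to the server that can deliver the data.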
Wed 16.00-18.00: Relationship, cooperation, governance
3 call-in presentations, (10 mins each max!)
- 16:00 US EPA - Terry Keating (HTAP, CyAir..) ...perspective on AQ data networking;
- The network contributes mostly at the hemispheric or global scale of air pollution, but it also benefits the local scale by specifying the contribution of global emissions to regional AQ.
- HTAP needs the AQ network. In that sense it is a client of this network and facilitates collaboration in science.
- If you connect the data systems, we can use the same channel to connect scientists/analysts globally.
- The AQ network needs HTAP because it provides focus and application, demonstrates value, and creates a demand for investment. Without demonstrating the value in concrete terms, it will not be easy to get funding to build the infrastructure.
- The cyberinfrastructure is focused on the data that are needed and used by EPA, but it should be linked to the global system. So far it is in planning mode on how to move to an interoperable system. It is looking to a broader community such as the CoP to define standard practices.
- 16:10 Nat.Park Serv. - Bret Schichtel (VIEWS) on data collection & usage in VIEWS DSS -
- 16:20 GEO - UIC. Adam Carpenter on GEO Earth Observation Priorities
- What kinds of end users does your organization represent and/or interact with?
- What are their specific precipitation data needs? What is that data needed for?
- Can you provide us documentation?
- Do you have other feedback or comments?
M. Schultz: Idea of a case study where we demonstrate how the machinery works with the network and in the absence of network.
Tim Dye: How do we bring in other newcomers? Pilot projects?
R Husar: Pilots are OK, but if at all possible the pilots should be driven by user need rather than done just for the sake of demonstration.
B Schichtel: What can the CoP contributed network offer to me for AQ analysis?
S. Galmarini: Hypothesizing about the possible benefits of the network is not very productive, because they will depend on those who are participating. It is only after the connections are made that one can observe what benefit this network actually created.
T. Keating: I think this is exactly the stage that we are at from the HTAP perspective. The dream that we had is that we could create a distributed network of modelling information, observational information of various types, and emissions information, and then be able to access those distributed databases through web browser based tools and do analyses comparing models to models, models to observations, models to emissions, emissions to observations, etc. I think we are only now at a stage where the connections between some of those distributed databases are robust enough that we can begin to build tools to do some of those analyses. I think that the sort of thing that Bret is after is the sort of system we want to be able to develop, to be able to go and access those distributed databases and answer questions. We need to develop analytical applications that can do that.
B Schichtel: Access to data for emissions, models and observations and possibly tools would also be beneficial to my project in the US national park service.
R Husar: Interoperability has to take place at organization level, data level... (show image)
M. Schultz: Such a collaboration between the MACC program and EPA takes place not only at program level but also at data sharing level. 10 or 20 years ago this would not have been possible.
B.Dominico: I am puzzled about what the issues are. As I recall, DataFed was available 5 years ago. Is the community not using the data?
R Husar: The problem is the slow pace at which AQ networking is evolving... Over the past 5 years businesses have adopted social networking, but there was little change in AQ networking. My feeling is that the impediments are not technological. But what are they? Unless we understand the actual impediments, this AQ CoP effort will be just "another wreck along the road" toward interoperable systems. We will address these impediments during the Thursday morning session.
B.Dominico: Now I understand some of the issues. I also notice that at this meeting we have representatives of data systems and technologies, but the users of AQ data are not well represented here. 25 years ago, when Unidata was formed, it was the users that got together and declared that somebody has to help with accessing the data. The driving force is the user community.
P. Kjeld: Yesterday I learned that this community has chosen the WCS protocol for delivering AQ data. We will then implement a WCS server to share the EEA datasets. However, EEA serves many communities and some require very different data delivery. So which delivery procedure should we focus on?
M. Schultz: This is why we need a compelling demonstration of the connected system. For instance, in the DataFed viewer and now at Juelich you can take data from multiple sources and overlay them as if they were part of a single data system. The benefits of switching to new systems are not apparent to the operational people.
T.Keating: Martin is right. In particular, education about the approach to sharing is important. When we talk to people who are running different data systems, their response is "we are sharing the data". But their concept of data sharing is that it is downloadable. We do have the technologies, but a lot of people don't know about them or about the approaches by which web service technologies can be used. The Community of Practice could do the educating and generate the educational materials.
R. Husar: Stefano Nativi has been a major driver force for introducing informatics as a connector and integrator for earth science activities in Europe. Stefano, what is your perspective?
S. Nativi: I listened to your discussion with considerable interest. A similar dialogue on why and how to share data has been going on in the biodiversity community, with quite successful outcomes. They prepared documents and a work plan that outline the activities toward data sharing and integration. It was driven by science. Another contribution of the community is to collect best practices.
R Husar: What do you think makes the biodiversity group work so well?
R. Husar: Following the comments of Terry and Stefano, this is a reminder to our software developer subgroup that identifying and describing their best practices would be highly desirable.
B.Dominico: The Unidata user community, made up primarily of meteorologists in academia and other researchers, has been interested in accessing real-time air quality data. I have been asked to facilitate access to AQ data so that it could be distributed through the Unidata network. I have been approaching EPA and others, but my efforts were unsuccessful. I just want to emphasize that there is a community of atmospheric researchers that asks for AQ data.
T. Dye: The need for AQ data may not be that huge. One of the challenges is that when we install something we have to think about how to run it for the next 10 years. It may be easy for Unidata but not for us. The other point is that we did not quite understand how the Unidata network operates.
U.Shankar: Here is a statement from the CMAS perspective. It is a rather complicated AQ modelling system. The infrastructure that supports its success typically depends upon community participation, which has had a major component of education and training.
P.Kjeld: We have a production environment and we have to make sure that it is not disturbed by anything we do. So we need to separate the data systems, for which we need to allocate people and resources. Also, we are not using all open source software but rather commercial, licensed software that is properly maintained, so we try to do as little development as possible.
P.Eckhardt: In response to Peter, at NILU our prototype WCS server is running on a separate system. Through replication of the operational database into the prototype we can shield the operational system from negative impacts.
Stefano: I think we are trying to solve too big a problem by seeking a universal solution to interoperability. What we should do instead of this philosophical discussion is just decide: do we want to be in or out of this activity? If you are in, keep your constraints (time, money, personnel) in mind and work together to solve the problem. With regard to education, the best way to educate is through example. If this community can demonstrate the benefits of the networked system, others will jump on it.
17:00 - 18:00 User perspective, value chain.
Relationships, cross-thematic links (EGIDA, ESIP), How can we collaborate? Network governance
Wed 18.00-19.00: General Discussion, with wine and cheese
Thu 8.00-10.00: Networking impediments and opportunities
JJ Bogardi, Global Water System Project (GWSP) at EGIDA, Bonn:
Nature of Networking Projects: Complex funding; mixture of paid and voluntary; multiple obligations; international; differing project maturity; governance, cultures
Which glue keeps it together? Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate
"Lethal" ingredients: Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy
- Clear statements about obstacles: no clear structure; no dedicated funding and no clear idea(s) yet how to do it.
- Opportunities, fixes: ID manageable work packages; Find, organize, distribute reusable components, resources; Create win-win situation
Thu 10.30-12.30: AQ Community Metadata Discussion
- Target User communities
- Use cases for different applications (from scientists to managers, media people and the general public)?
- How can users find out about (each) system? Big future issues: data quality, traceability, metadata..
Thu 16.00-18.00: Workshop outputs, outcomes, plans?
What are the anticipated outputs? Agreement on community WCS server for grid and point data; server governance, distributed catalog; workshop summary
What are the anticipated outcomes? More servers and data added to the shared data pool and more willing users of shared data.