Difference between revisions of "Solta 2011 Agenda"

From Earth Science Information Partners (ESIP)
(Reverted edits by 207.46.13.145 (talk) to last revision by Martin Schultz (MartinSchultz))
 
(155 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 
<noinclude>{{AQ CoP Solta2011 Backlinks}}</noinclude>
 
<noinclude>{{AQ CoP Solta2011 Backlinks}}</noinclude>
  
'''Monday Evening:''' Registration and Social 
+
== Monday 16:00-18:00: [[Data Catalog Side Meeting]]  ||| [[Talk:Data_Catalog_Side_Meeting|Discussion]] ==
 +
<center>'''Location: LeoMar, Boldrini/Bigagli'''</center>
 +
'''GI-cat''' - ESSI Federated Catalog (aka EuroGEOSS Discovery Broker, part of GEOSS Common Infrastructure)<br>
 +
'''AQComCat''' - Air Quality Community Catalog<br>
 +
'''GI-cat-AQComCat link''' - Federating AQComCat into GI-cat; Accessing GI-cat from AQComCat<br>
  
==Tue 8.00-10.00: Self-Introduction, 5 mins/participant==
+
== Monday 18:00-19:00 Social ==
In order to make efficient use of our time in Croatia, we ask you all to prepare for the workshop in the following ways (aside from arranging your travel etc.):
+
==Tue 08.00-10.00: Self-Introduction, 5 mins/participant==
*Slides 1-2: Name, Institution, Relevant research, development or organizational work on AQ data system interoperability and networking
+
<center>'''Location: LeoMar, Husar/Vidic'''</center>
*Slide(s) 3-(4): Involvement and participation in projects, programs, i.e. list of Integrating Initiatives. Potential contributions.
+
'''Welcome'''<br>
 +
'''Logistics'''<br>
 +
'''Agenda and Procedures'''<br>
 +
* Daily 8:00-12:30; 16:00-19:00; Coffee break and 18:00-19:00 (1.5 h) informal interaction
 +
* Total of ten two-hour sessions.
 +
** First half focused on data server, catalog software
 +
** Second half on more general AQ data networking
 +
** Two rapporteurs  for each session 
 +
----
 +
'''Self-Introduction''' - by each participant
 +
*Slides 1-2: Name, institution, relevant research (on interoperability, networking), participation in major projects/programs
 +
*Slide(s) 3-(4): What would you like to take away from the workshop; what would you like to offer to the workshop (or AQ CoP)
 +
 
 +
==Tue  10.30-12.30: Introduction of Hubs, 5 min each==
 +
<center>'''Location: LeoMar, Schultz/Bernonville'''</center>
 +
'''Intro of AQ Data Hubs''' - by their representatives<br>
 +
* AQ Community Servers: Common  DataFed, FJ Juelich, NGC/CIERA,  EBAS
 +
* Other Servers: DLR/ACP, AIRNow, EEA NRT, AQMEII  (AeroCom, RSIG, GIOVANNI)<br>
 +
'''Air Quality Community of Practice (AQ CoP)''' - R Husar<br>
 +
'''Air Quality Community Data Server''' - M. Schultz<br>
  
==Tue  10.30-12.30: Hubs, servers, ADN, CoP==
+
==Tue  16.00-18.00: IT Breakout: [[AQ Community server software]]==
This session will be a status report on the current state of data accessibility and networking.<br>
+
<center>'''Location: LeoMar, Decker/Hoijarvi'''</center>
*Short description of major data hubs<br>
+
'''Use of netCDF and other data formats'''<br>
*current state of AQ data networking with the standards based community based server software <br>
+
'''Gridded data service through WCS'''<br>
*Role of GEO AQ CoP<br>
+
'''Station-point data service (SQL)'''<br>
 +
'''Data server performance issues/solutions; Server co-development tools'''<br>  
 +
'''Relationship to other WCS servers; Real Data-to-WCS/WFS/WMS-Mapping'''<br>
 +
[[Talk:Candidate_Technical_Topics|Notes]]
  
==Tue  16.00-18.00: IT: Community server software==
+
==Tue  16.00-18.00: Scope and Type of Data to be Served==
This session covers standards and conventions | Implementation for gridded and station data | Development tools | Server performance<br>
+
<center>'''Location: LeoMar, Vik/Fialkowski'''</center> <br>
  
'''Data Servers: Technical Realization (IT) Issues and Solutions'''<br>
+
*  '''Ensemble System''' (JRC) Presentation from Stefano Galmarini on AQMEII and ENSEMBLE (Data Hub/Facilitator)
* WMS for display. Issues? Atm. Composition Portal contribution
+
** Written in Perl and IDL
* WFS .. for station description 
+
** Comparisons and evaluation of models
* WCS data encoding
+
** Coordinated model harmonization across 27 models
** Data structure hierarchy: DataHub; Service; Coverage: Field; Flag
+
** Also a repository for the dissemination of model and AQ measurements
** WCS 1.1: Service->Group of similar datasets; Coverage->Dataset; Field->Parameter; Flag->Flag
+
** Produced 4-page tech spec docs that were sent to groups for supplying data; this was a key step in making the process smoother.
** Combination of W*S services: WCS->Access data; WFS->Access spatial metadata; WMS->Display spatial data
+
** Main features
'''Real Data-to-WCS-Mapping structure'''<br>
+
*** Transfer of very large model output across internet
* Data hub that exposes the data  ==> Provider    ==>  WCS Service 
+
*** Storage of 1,2,3D model data
* Observation platform or network ==> Dataset    ==>  WCS Coverage
+
*** Quick access of large dataset
* Observation parameter/variable ==> Parameter ==> WCS Field
+
*** Distribution of KML and WMS (in progress)
'''Issues re. the use of netCDF and other data formats'''<br>
+
** Use an ASCII format file to describe the model files
netCDF is standard format for multi-dimensional data. Cf-netCDF is used both as an archival format of grid data as well as a payload format for WCS queries.
+
** Have a program called ENFORM (Fortran) that transforms model data into the compressed dataset (1.2GB → 200MB)
* Issue: ambiguity of CF
+
** Can produce GE projected files (nice)
* Issue: We should define a standard python interface (PyNIO, python-netcdf4, scipy.io.netcdf?)
+
** 11 papers came out in a special journal edition
* Issue: Delivery of (small) data sets in ASCII/csv format
 
* Issue: Reading of grib data (?)
 
'''Use of WMS, WCS, WFS .. in combination?'''<br>
 
Data display/preview is through WMS. AQ data can be delivered through WCS and WFS. In AComServ, WCS is used for transferring n-dimensional grid and point-station data; WFS is used for delivering monitoring station descriptions.
 
* Issue: WMS interface for preview; "latest" token for dynamic links?
 
  
'''WCS versions'''<br>
+
* '''Data Versioning.''' <br>
WCS is implemented in multiple versions: 1.0, 1.1.2, 2.0. The AQ Community Server (AComServ) is now implemented using WCS 1.1.2. Define here the WCS version (WCS 2.0) issue in about one sentence<br>
+
AVik suggested that the timestamp of data submission is a useful versioning approach.
'''Gridded data service through WCS'''<br>
+
** Need additional flags on observation level
This generally works well.
+
** Latest time stamp requires to data ...
* Issue: Extraction of vertical levels?
+
** Version dates could confuse broad range of users
* Issue: Ambiguity of WCS; core plus extended (do we know what is valid?)
+
*** Martin - Applications would use the version date, not users
 +
** Who are the next users in this chain?
 +
* '''GEO AQ CoP.'''
 +
== AQ CoP ==
 +
* RHusar: What is the AQ Community of Practice and what does it do? [[Media:110824_Solta11_Intro.ppt|CoP Intro PPT]]
 +
** Pic of data pool - started discussion with the idea that the CoP should only create a distributed data pool and build the data network.
 +
[[Image:SoltaIntro_1.jpeg|300px]] | [[Image:SoltaIntro_2.jpeg|300px]]
 +
*** The group thought this was too narrow and had issues with how the CoP differed from the facilitators.
 +
*** Reworded the CoP purpose: the CoP should "Connect and Enable" AQ data networks, hubs and facilitators so that they can connect and enable the data needed for their systems.
 +
----
 +
 
 +
'''Data Providers''': Existing Data Hubs - how do they work?<br>
 +
'''Data  Classes'''<br>
 +
* by data source-driver (mandated, research)
 +
* by content/platform (emission, ambient, remsens, model)
 +
* by space-time (global, regional)
 +
'''Data Level'''<br>
 +
* Primary (original), Secondary, Mediated
 +
* Raw, processed, how?
  
'''Delivery of station-point data'''
+
==Tue 18.00-19.00: General Discussion, with wine and cheese==
* Issue: use WCS or WFS, Combination of both?
 
* Access rights?
 
  
'''Data server performance issues/solutions?'''<br>
+
==Wed  8.00-10.00: Breakout reports, general server items==
Define performance issues, measurements<br>
+
<center>'''Location: LeoMar, Domenico/Goussev'''</center>  
'''Server co-development tools, methods'''<br>  
 
Server code is maintained through SourceForge, Darcs code repositories are available at WUSTL and in Juelich.
 
* Issues: Version control, Platform independence, Documentation
 
  
==Tue  16.00-18.00: NoIT: ADN scope, providers, users..==
+
<center><br>'''What few things must be the same, so that everything else can be different?'''<br></center>
  
'''What is the purpose of ADN?'''<br>
+
[[WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type]]
'''Who are the participating users of the network? What are their roles?'''<br>
 
'''What organizations are stakeholders in the network'''? <br>
 
'''How do they relate to ADN?'''<br>
 
  
==Wed  8.00-10.00: Reports from breakout sessions and general discussion==
 
  
 
'''Report from the IT breakout session: Community server software'''<br>
 
'''Report from non IT breakout session:ADN scope, providers, users.. '''<br>
+
'''Report from non IT breakout session: ADN scope, providers, users'''<br>
 +
'''Report from the pre-workshop Data Catalog side meeting''' <br>
 +
'''Crossover topics between IT and non-IT issues'''<br>
 +
 
 +
==Wed  10.30-12.30: [[Data network catalog and clients]]==
 +
<center>'''Location: LeoMar, Bigagli/Eckhardt'''</center>
 +
'''Functionality of an Air Quality Data Network Catalog''' Who, what, where, when<br>
 +
'''Catalog content and structure (granularity) of ADNC?''' <br>
 +
'''Minimal metadata for discovery, data provenance, quality, access constraints?'''<br>
 +
'''Single AQ Catalog? Distributed? Service-oriented? Access rights'''<br>
 +
 
 +
* Ben: Similarity with hydrology
 +
* Stefano: GI-cat - help on CSW implementation
 +
*  Paul: Catalog has to be linked to the data offering services
 +
 
 +
'''Catalog Clients'''<br>
 +
 
 +
==Wed 16.00-18.00: Relationship, cooperation, governance==
 +
<center>'''Location: LeoMar, Nativi/Robinson'''</center>
 +
 
 +
<center>[http://vimeo.com/28437680 Session recording in MP4 format on Vimeo]</center>
 +
 
 +
3 call-in presentations, (10 mins each max!)<br>
 +
* 16:00 US EPA - Terry Keating (HTAP, CyAir..) ...perspective on AQ data networking;
 +
** The network contributes mostly to hemispheric- or global-scale air pollution work, but it also benefits the local scale by specifying the contribution of global emissions to regional AQ.
 +
** HTAP needs the AQ network. In that sense it is a client of this network and facilitates collaboration in science.
 +
** If you connect the data systems, using the same channel we can connect scientists/analysts globally
 +
** The AQ network needs HTAP because it provides focus and application, demonstrates value, and creates a demand for investment. Without demonstrating the value in concrete terms, it will not be easy to get funding to build the infrastructure.
 +
** The cyberinfrastructure is focused on the data that are needed and used by EPA, but it should be linked to the global system. So far, how to move to an interoperable system is in planning mode. It is looking to a broader community such as the CoP to define standard practices.
 +
 
 +
 
 +
 
 +
* 16:10 Nat. Park Serv. - Bret Schichtel (VIEWS) on data collection & usage in VIEWS DSS
 +
* 16:20 [[Media:Precipitation Presentation with Notes 08-16-2011.pdf|GEO - UIC. Adam Carpenter on GEO Earth Observation Priorities]]
 +
**What kinds of end users does your organization represent and/or interact with?
 +
**What are their specific precipitation data needs?  What is that data needed for?
 +
**Can you provide us documentation?
 +
**Do you have other feedback or comments?
 +
 
  
'''Possible discussion topics/focus on cross-overs between IT and non-IT issues'''<br>  
+
16:30-17:00 Discussion <br>
* standard definitions (clarity, ambiguity, completeness, ...)
 
* standard development and documentation
 
* open-source server software development
 
* platform issues, portability
 
* coding language(s), code interchangeability
 
* coding style and software development approaches
 
*            Data Content
 
* organisation of data
 
* data formats, standard compliance
 
* data access
 
* performance
 
* flexibility
 
* user friendliness
 
* meeting user demands (fitness for purpose)
 
* governance, responsibilities, etc.
 
* Open Source collaborative approach. Issues?
 
* General software design: Multi-layer, Multi-protocol. Standard-Convention driven
 
* Porting, Installation. Issues?
 
* Maintenance, governance. Issues?
 
* Criteria for single (trusted ?) 'primary' data source
 
* Designations for secondary, derived, augmented data sources
 
  
==Wed  10.30-12.30: AQ network: Servers, Catalog, Clients==
+
M. Schultz: Idea of a case study where we demonstrate how the machinery works with the network and in the absence of network.
Preparing the way forward...
 
  
'''What few things must be the same, so that everything else can be different?'''
+
Tim Dye: How do we bring in other newcomers? Pilot projects?
  
Metadata for finding and understanding, CF, ISO)<br>
+
R Husar: Pilots are OK, but if at all possible the pilots should be driven by user need rather than done just for the sake of demonstration.
Data access/use constraints, quality control, data versioning, etc.<br>
 
'''What is the design philosophy'''<br>
 
Service oriented (everything is a service), Component and network design for change; open source (everything?!) <br>
 
'''Network-level data flow, usage statistics (GoogleAnalytics), performance'''<br>
 
  
... goal is to obtain a good basis for discussion in the following breakout sessions, both from the IT and non-IT sides.
 
  
==Wed  16.00-18.00: IT: Other servers & catalog ==
+
B Schichtel: What can the CoP-contributed network offer to me for AQ analysis?
* Server Software Design (uFIND). Issues?
 
'''Functionality of an Air Quality Data Network Catalog (ADNC)?'''<br>
 
'''Content and structure (granularity) of ADNC?''' <br>
 
'''Interoperability of ADNC<br>'''
 
* Interoperability with whom? what standards are needed? CF Naming extensions?<br>
 
* AQ Discovery Metadata Convention (for use in ISO, Data Catalogs...)
 
* Extend CF Naming conventions for Point Data
 
* Devise human-readable CF naming equivalents?
 
'''Access rights and access management'''<br>
 
  
'''What are the generic (ISO, GEOSS, INSPIRE) and the AQ-specific discovery metadata?'''<br>
+
S. Galmarini: Hypothesizing what possible benefits the networks can have is not very productive because it will depend on those who are participating. It's only after the connections are made that one can observe what benefit this network actually created.
'''Minimal metadata for data provenance, quality, access constraints?'''<br>
 
'''Single AQ Catalog? Distributed? Service-oriented?'''<br>
 
* GI-cat
 
* uFind
 
  
==Wed 16.00-18.00: NoIT: Links, coop., governance..==
+
T. Keating: I think this is exactly the stage that we are at from the HTAP perspective. The dream that we had is that we could create a distributed network of modelling information, observational information of various types, and emissions information, and then, through web-browser-based tools, be able to access those distributed databases and do analyses comparing models to models, models to observations, models to emissions, emissions to observations, etc. I think we are only now at a stage where we have developed enough robustness in the connections between some of those distributed databases that we can begin to build tools to do some of those analyses. I think that the sort of thing that Bret is after is the sort of system we want to be able to develop, to be able to go and access those distributed databases and answer questions. We need to develop analytical applications that can do that.
  
* User perspective, value chain .. user can not find..
+
B Schichtel: Access to data for emissions, models and observations, and possibly tools, would also be beneficial to my project in the US National Park Service.
  
 +
E Robinson:
 +
 +
R Husar: Interoperability has to take place at the organization level, data level... (show image)
 +
 +
M. Schultz: Such a collaboration between the MACC program and EPA is happening not only at the program level but also at the data sharing level. 10 or 20 years ago this would not have been possible.
 +
 +
B. Domenico: I am puzzled about what the issues are. As I recall, DataFed was available 5 years ago. Is the community not using the data?
 +
 +
R Husar: The problem is the slowness of the pace at which AQ networking is evolving... Over the past 5 years businesses have adopted social networking, but there has been little change in AQ networking. My feeling is that the impediments are not technological. But what are they? Unless we understand the actual impediments, this AQ CoP effort will be just "another wreck along the road" toward interoperable systems. We will address these impediments during the Thursday morning session.
 +
 +
B. Domenico: Now I understand some of the issues, and I also notice that at this meeting we have representatives of data systems and technologies, but the users of AQ data are not well represented here. 25 years ago, when Unidata was formed, it was the users that got together and declared that somebody has to help with accessing the data.
 +
The driving force is the user community.
 +
 +
P. Kjeld: Yesterday I learned that this community has chosen the WCS protocol for delivering AQ data. We will then implement a WCS server to share the EEA datasets. However, EEA serves many communities, and some require very different data delivery. So which delivery procedure should we focus on?
 +
 +
M. Schultz: This is why we need a compelling demonstration of the connected system. For instance, in the DataFed viewer and now at Juelich you can take data from multiple sources and overlay them as if they were part of a single data system. The benefits of switching to new systems are not apparent to the operational people.
 +
 +
T. Keating: Martin is right. In particular, education about the approach to sharing is important. When we talk to people who are running different data systems, their response is that "we are sharing the data". But their concept of data sharing is that it's downloadable. We do have the technologies, but a lot of people don't know about them or about the approaches by which the web service technologies can be used. The Community of Practice could do the educating.
 +
Generate the educational materials.
 +
 +
R. Husar: Stefano Nativi has been a major driving force for introducing informatics as a connector and integrator for earth science activities in Europe. Stefano, what is your perspective?
 +
 +
S. Nativi: I listened to your discussion with considerable interest. A similar dialogue on why and how to share data has been going on in the biodiversity community, with quite successful outcomes. They prepared documents and a work plan that outline the activities toward data sharing and integration. It was driven by science. Another contribution of the community is to collect best practices.
 +
 +
R Husar: What do you think makes the biodiversity group work so well?
 +
 +
S.Nativi:
 +
 +
R. Husar: Following the comments of Terry and Stefano, this is a reminder to our software developer subgroup that identifying and describing their best practices would be highly desirable.
 +
 +
B. Domenico: The Unidata user community, made up primarily of meteorologists in academia and other researchers, has been interested in accessing real-time air quality data. I have been asked to facilitate access to AQ data so that it could be distributed through the Unidata network. I have been approaching EPA and others, but my efforts were unsuccessful. I just want to emphasize that there is a community of atmospheric researchers that asks for AQ data.
 +
 +
T. Dye: The need for AQ data may not be that huge. One of the challenges is that when we install something we have to think about how to run it for the next 10 years. It may be easy for Unidata but not for us. The other point is that we did not quite understand how the Unidata network operates.
 +
 +
U. Shankar: Here is a statement from the CMAS perspective. It is a rather complicated AQ modelling system. The infrastructure that supports its success typically depends upon community participation, which has had a major component of education and training.
 +
 +
P. Kjeld: We have a production environment and we have to make sure that it is not disturbed by anything we do. So we need to separate the data systems, for which we need to allocate people and resources. Also, we are not using all open source software but rather commercial, licensed software that is properly maintained, so we try to do as little development as possible.
 +
 +
P. Eckhardt: In response to Peter, at NILU our prototype WCS server is running on a separate system. Through replication of the operational database into the prototype we can shield the operational system from negative impacts.
 +
 +
Stefano: I think we are trying to solve too big a problem by seeking a universal solution for interoperability. What we should do instead of this philosophical discussion is just decide: do we want to be in or out of this activity? If you are in, keep your constraints (time, money, personnel) in mind and work together to solve the problem. With regard to education, the best way to educate is through example. If this community can demonstrate the benefits of the networked system, others will jump on it.
 +
 +
 +
 +
17:00 - 18:00 User perspective, value chain.<br>
 +
Relationships, cross-thematic links (EGIDA, ESIP), How can we collaborate? Network governance 
 
* http://www.delicious.com/tag/integratinginitiative+governance
 
  
 +
==Wed 18.00-19.00: General Discussion, with wine and cheese==
 +
 +
<big><center>Group dinner</center></big>
 +
 +
==Thu  8.00-10.00: [[Networking impediments and opportunities]]==
 +
<center>'''Location: LeoMar, Kjeld/Ludewig'''</center>
 +
[http://www.egida-project.eu/images/documents/bogardibonn.pdf JJ Bogardi, Global Water System Project (GWSP)] at EGIDA, Bonn:<br>
 +
'''Nature of Networking Projects:''' Complex funding; mixture of paid and voluntary; multiple obligations; international; differing project maturity; governance, cultures <br>
 +
'''Which glue keeps it together?''' Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate <br>
 +
'''„Lethal“ ingredients.''' Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy<br>
 +
 +
* Clear statements about obstacles: no clear structure; no dedicated funding and no clear idea(s) yet how to do it.
 +
* Opportunities, fixes: ID manageable work packages; Find, organize, distribute reusable components, resources; Create win-win situation
 +
 +
==Thu  10.30-12.30: [[AQ Community Metadata Discussion]] ==
 +
<center>'''Location: LeoMar, Galmarini/Dye'''</center>
  
  
Relationships, cross-thematic links (EGIDA, ESIP)
 
How can we collaborate? 
 
<big><center>Group dinner</center></big>
 
  
==Thu  8.00-10.00: Networking impediments, opportunities==
 
  
* Clear statements about obstacles..
 
** no organizational structure,
 
** no dedicated funding or
 
** no clear idea yet how to do it (or several independent ideas?).
 
  
* Opportunities, fixes
 
** Identification of manageable work packages
 
** Reusable components, resources
 
----
 
JJ Bogardi, Global Water System Project (GWSP) at EGIDA, Bonn:<br>
 
'''Nature of Networking Projects:''' Complex funding .., multiple obligations. Interdisciplinary and international. Differing project maturity. Mixture of paid and voluntary contributors. Governance and project cultures may differ. <br>
 
'''Which glue keeps it together?''' Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate <br>
 
'''„Lethal“ ingredients.''' Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy<br>
 
 
----
 
----
What can we do to achieve a "win-win" situation?
 
 
==Thu  10.30-12.30: ADN user relations/help..whom, what?==
 
 
* Target User communities
 
 
* Use cases for different applications (from scientists to managers, media people and the general public)?
 
 
* How can users find out about (each) system? Big future issues: data quality, traceability, metadata..
 
  
'''Relationship to non-AComServ WCS servers'''<br>
+
==Thu  16.00-18.00: [[Workshop outputs, outcomes, plans? ]]==
* Issue: protocol compatibility, standard compliance, data format(s)
+
<center>'''Location: LeoMar, Schultz/Husar'''</center>
* Issue: is there a need? How can we benefit from "other" data? How can they benefit from AQ data?
+
<center>[http://vimeo.com/28432407 Session recording in MP4 format on Vimeo]</center>
* Issue: which protocols? (OpenDAP?, GIS servers?)
+
'''What are the anticipated outputs?''' Agreement on [[WCS_Server_Software|community WCS server]] for grid and point data; server governance, distributed catalog; workshop summary<br>
 +
'''What are the anticipated outcomes?''' More servers and data added to the shared data pool and more willing users of shared data.<br>
 +
 
 +
==Thu 18.00-19.00: General Discussion, with wine and cheese==
  
==Thu  16.00-18.00: Outputs, outcomes, plans ? ==
 
'''What are the anticipated outputs?''' Agreement on [[WCS_Server_Software|community WCS server]] for grid and point data; server governance, distributed catalog; workshop summary<br>
 
'''What are the anticipated outcomes?''' Better understanding of the network, higher level of trust and ''concrete steps toward turning the ADN from virtual to real'' <br>
 
What are the short-term opportunities?  Do we have
 
Common long-term goals and visions? 
 
 
<big><center>Friday: Boat trip</center></big>
 

Latest revision as of 10:22, July 28, 2012


Monday 16:00-18:00: Data Catalog Side Meeting ||| Discussion

Location: LeoMar, Boldrini/Bigagli

GI-cat - ESSI Federated Catalog (aka EuroGEOSS Discovery Broker, part of GEOSS Common Infrastructure)
AQComCat - Air Quality Community Catalog
GI-cat-AQComCat link - Federating AQComCat into GI-cat; Accessing GI-cat from AQComCat

Monday 18:00-19:00 Social

Tue 08.00-10.00: Self-Introduction, 5 mins/participant

Location: LeoMar, Husar/Vidic

Welcome
Logistics
Agenda and Procedures

  • Daily 8:00-12:30; 16:00-19:00; Coffee break and 18:00-19:00 (1.5 h) informal interaction
  • Total of ten two-hour sessions.
    • First half focused on data server, catalog software
    • Second half on more general AQ data networking
    • Two rapporteurs for each session

Self-Introduction - by each participant

  • Slides 1-2: Name, institution, relevant research (on interoperability, networking), participation in major projects/programs
  • Slide(s) 3-(4): What would you like to take away from the workshop; what would you like to offer to the workshop (or AQ CoP)

Tue 10.30-12.30: Introduction of Hubs, 5 min each

Location: LeoMar, Schultz/Bernonville

Intro of AQ Data Hubs - by their representatives

  • AQ Community Servers: Common DataFed, FJ Juelich, NGC/CIERA, EBAS
  • Other Servers: DLR/ACP, AIRNow, EEA NRT, AQMEII (AeroCom, RSIG, GIOVANNI)

Air Quality Community of Practice (AQ CoP) - R Husar
Air Quality Community Data Server - M. Schultz

Tue 16.00-18.00: IT Breakout: AQ Community server software

Location: LeoMar, Decker/Hoijarvi

Use of netCDF and other data formats
Gridded data service through WCS
Station-point data service (SQL)
Data server performance issues/solutions; Server co-development tools
Relationship to other WCS servers; Real Data-to-WCS/WFS/WMS-Mapping
Notes

Tue 16.00-18.00: Scope and Type of Data to be Served

Location: LeoMar, Vik/Fialkowski


  • Ensemble System (JRC) Presentation from Stefano Galmarini on AQMEII and ENSEMBLE (Data Hub/Facilitator)
    • Written in Perl and IDL
    • Comparisons and evaluation of models
    • Coordinated model harmonization across 27 models
    • Also a repository for the dissemination of model and AQ measurements
    • Produced 4-page tech spec docs that were sent to groups for supplying data; this was a key step in making the process smoother.
    • Main features
      • Transfer of very large model output across internet
      • Storage of 1,2,3D model data
      • Quick access of large dataset
      • Distribution of KML and WMS (in progress)
    • Use an ASCII format file to describe the model files
    • Have a program called ENFORM (Fortran) that transforms model data into the compressed dataset (1.2GB → 200MB)
    • Can produce GE projected files (nice)
    • 11 papers came out in a special journal edition
  • Data Versioning.

AVik suggested that the timestamp of data submission is a useful versioning approach.

    • Need additional flags on observation level
    • Latest time stamp requires to data ...
    • Version dates could confuse broad range of users
      • Martin - Applications would use the version date, not users
    • Who are the next users in this chain?
  • GEO AQ CoP.
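
The submission-timestamp versioning raised under Data Versioning above could be sketched roughly as follows. This is only an illustration in Python and is not taken from any workshop server software: the `Submission` record, its field names, the `value_as_of` helper, and the sample station and values are all hypothetical. The sketch assumes that every resubmission of an observation is kept, and that an application (rather than the end user) selects the record with the latest submission time, optionally as of a given date.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Submission:
    """One submitted value (hypothetical record layout)."""
    station: str
    parameter: str
    observed_at: datetime   # time the observation refers to
    submitted_at: datetime  # submission time, used as the version
    value: float


def value_as_of(
    submissions: Iterable[Submission],
    station: str,
    parameter: str,
    observed_at: datetime,
    as_of: Optional[datetime] = None,
) -> Optional[Submission]:
    """Return the most recently submitted matching record.

    If as_of is given, submissions made after that time are ignored,
    so an earlier state of the data can be reproduced.
    Returns None when nothing matches.
    """
    candidates = [
        s for s in submissions
        if s.station == station
        and s.parameter == parameter
        and s.observed_at == observed_at
        and (as_of is None or s.submitted_at <= as_of)
    ]
    return max(candidates, key=lambda s: s.submitted_at, default=None)


# Made-up sample data: the same observation submitted twice.
obs = datetime(2011, 8, 23, 12, 0)
records = [
    Submission("ST01", "O3", obs, datetime(2011, 8, 24), 41.0),
    Submission("ST01", "O3", obs, datetime(2011, 9, 15), 43.5),
]

latest = value_as_of(records, "ST01", "O3", obs)
earlier = value_as_of(records, "ST01", "O3", obs, as_of=datetime(2011, 8, 31))
print(latest.value, earlier.value)
```

Keeping the version date inside the application query, in line with Martin's remark, means that a broad range of users never has to see or interpret it.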

AQ CoP

  • RHusar: What is the AQ Community of Practice and what does it do? CoP Intro PPT
    • Pic of data pool - started discussion with the idea that the CoP should only create a distributed data pool and build the data network.

SoltaIntro 1.jpeg | SoltaIntro 2.jpeg

      • The group thought this was too narrow and had issues with how the CoP differed from the facilitators.
      • Reworded the CoP purpose: the CoP should "Connect and Enable" AQ data networks, hubs and facilitators so that they can connect and enable the data needed for their systems.

Data Providers: Existing Data Hubs - how do they work?
Data Classes

  • by data source-driver (mandated, research)
  • by content/platform (emission, ambient, remsens, model)
  • by space-time (global, regional)

Data Level

  • Primary (original), Secondary, Mediated
  • Raw, processed, how?

Tue 18.00-19.00: General Discussion, with wine and cheese

Wed 8.00-10.00: Breakout reports, general server items

Location: LeoMar, Domenico/Goussev

What few things must be the same, so that everything else can be different?

WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type


Report from the IT breakout session: Community server software
Report from non IT breakout session: ADN scope, providers, users
Report from the pre-workshop Data Catalog side meeting
Crossover topics between IT and non-IT issues

Wed 10.30-12.30: Data network catalog and clients

Location: LeoMar, Bigagli/Eckhardt

Functionality of an Air Quality Data Network Catalog: Who, what, where, when
Catalog content and structure (granularity) of ADNC?
Minimal metadata for discovery, data provenance, quality, access constraints?
Single AQ Catalog? Distributed? Service-oriented? Access rights

  • Ben: Similarity with hydrology
  • Stefano: GI-cat - help on CSW implementation
  • Paul: Catalog has to be linked to the data offering services

Catalog Clients

Wed 16.00-18.00: Relationship, cooperation, governance

Location: LeoMar, Nativi/Robinson
Session recording in MP4 format on Vimeo

3 call-in presentations, (10 mins each max!)

  • 16:00 US EPA - Terry Keating (HTAP, CyAir..) ...perspective on AQ data networking;
    • The network contributes mostly to hemispheric or global-scale air pollution work, but it also benefits the local scale by specifying how much global emissions contribute to regional AQ.
    • HTAP needs the AQ network. In that sense it is a client of this network and facilitates collaboration in science.
    • If you connect the data systems, the same channel can be used to connect scientists/analysts globally.
    • The AQ network needs HTAP because it provides focus and application, demonstrates value and creates a demand for investment. Without demonstrating the value in concrete terms it will not be easy to get funding to build the infrastructure.
    • The cyberinfrastructure is focused on the data that are needed and used by EPA, but it should be linked to the global system. So far it is in planning mode on how to move to an interoperable system. It is looking to a broader community such as the CoP to define standard practices.


  • 16:10 Nat. Park Serv. - Bret Schichtel (VIEWS) on data collection & usage in VIEWS DSS
  • 16:20 GEO - UIC. Adam Carpenter on GEO Earth Observation Priorities
    • What kinds of end users does your organization represent and/or interact with?
    • What are their specific precipitation data needs? What is that data needed for?
    • Can you provide us documentation?
    • Do you have other feedback or comments?


16:30-17:00 Discussion

M. Schultz: Idea of a case study where we demonstrate how the machinery works with the network and in the absence of the network.

Tim Dye: How do we bring in other newcomers? Pilot projects?

R Husar: Pilots are OK, but if at all possible they should be driven by user need rather than done just for the sake of demonstration.


B Schichtel: What can the network contributed by the CoP offer me for AQ analysis?

S. Galmarini: Hypothesizing what possible benefits the networks can have is not very productive because it will depend on those who are participating. It is only after the connections are made that one can observe what benefit this network actually created.

T. Keating: I think this is exactly the stage that we are at from the HTAP perspective. The dream that we had is that we could create a distributed network of modelling information, observational information of various types and emissions information, and then be able, through web-browser-based tools, to access those distributed databases and do analysis comparing models to models, models to observations, models to emissions, emissions to observations etc. I think we are only now at a stage where we have developed enough robustness in the connections between some of those distributed databases that we can begin to build tools to do some of those analyses. I think the sort of thing that Bret is after is the sort of system we want to develop, to be able to go and access those distributed databases and answer questions. We need to develop analytical applications that can do that.

B Schichtel: Access to data for emissions, models and observations and possibly tools would also be beneficial to my project in the US national park service.

E Robinson:

R Husar: Interoperability has to take place at the organization level, data level... (show image)

M.Schultz: Such a collaboration between the MACC program and EPA exists not only at the program level but also at the data sharing level. 10 or 20 years ago this would not have been possible.

B.Domenico: I am puzzled about what the issues are. As I recall, DataFed was available 5 years ago. Is the community not using the data?

R Husar: The problem is the slowness of the pace at which AQ networking is evolving... Over the past 5 years businesses have adopted social networking, but there was little change in AQ networking. My feeling is that the impediments are not technological. But what are they? Unless we understand the actual impediments, this AQ CoP effort will be just "another wreck along the road" toward interoperable systems. We will address these impediments during the Thursday morning session.

B.Domenico: Now I understand some of the issues. I also notice that at this meeting we have representatives of data systems and technologies, but the users of AQ data are not well represented here. 25 years ago, when Unidata was formed, it was the users who got together and declared that somebody has to help with accessing the data. The driving force is the user community.

P. Kjeld: Yesterday I learned that this community has chosen the WCS protocol for delivering AQ data. We will then implement a WCS server to share the EEA datasets. However, EEA serves many communities and some require very different data delivery. So which delivery procedure should we focus on?
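For readers unfamiliar with WCS, the sketch below shows how a client might assemble a key-value GetCoverage request for such a server. It assumes WCS 1.0.0 parameter names; the minutes do not say which WCS version the community selected. The base URL, coverage name, grid size and the "NetCDF" format string are hypothetical placeholders, and a real server advertises its supported values in its capabilities and coverage descriptions.

```python
from urllib.parse import urlencode


def getcoverage_url(base_url, coverage, bbox, time, width, height, fmt):
    """Build a WCS 1.0.0 key-value GetCoverage URL (version is an assumption)."""
    params = {
        "service": "WCS",
        "version": "1.0.0",
        "request": "GetCoverage",
        "coverage": coverage,                     # hypothetical coverage name
        "crs": "EPSG:4326",                       # longitude/latitude in degrees
        "bbox": ",".join(str(v) for v in bbox),   # minx,miny,maxx,maxy
        "time": time,
        "width": width,                           # output grid size in cells
        "height": height,
        "format": fmt,                            # must be a format the server offers
    }
    return base_url + "?" + urlencode(params)


url = getcoverage_url(
    "http://example.org/wcs",        # placeholder endpoint, not a real server
    "ozone_daily",
    (-10.0, 35.0, 30.0, 60.0),
    "2010-06-15",
    400,
    250,
    "NetCDF",
)
print(url)
```

Because the request is just a URL, the same call works against any server that implements the protocol, which is the point of agreeing on one delivery interface.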

M. Schultz: This is why we need a compelling demonstration of the connected system. For instance, in the DataFed viewer and now at Julich you can take data from multiple sources and overlay them as if they were part of a single data system. The benefits of switching to new systems are not apparent to the operational people.

T.Keating: Martin is right. In particular, education about the approach to sharing is important. When we talk to people who are running different data systems, their response is "we are sharing the data". But their concept of data sharing is that the data are downloadable. We do have the technologies, but a lot of people don't know about them or about the approaches by which web service technologies can be used. The Community of Practice could do the educating and generate the educational materials.

R. Husar: Stefano Nativi has been a major driving force for introducing informatics as a connector and integrator for earth science activities in Europe. Stefano, what is your perspective?

S. Nativi: I listened to your discussion with considerable interest. A similar dialogue on why and how to share data has been going on in the biodiversity community, with quite successful outcomes. They prepared documents and a work plan that outline the activities toward data sharing and integration. It was driven by science. Another contribution of the community is to collect best practices.

R Husar: What do you think makes the biodiversity group work so well?

S.Nativi:

R. Husar: Following the comments of Terry and Stefano, this is a reminder to our software developer subgroup that identifying and describing their best practices would be highly desirable.

B.Domenico: The Unidata user community, made up primarily of meteorologists in academia and other researchers, has been interested in accessing real-time air quality data. I have been asked to facilitate access to AQ data so that it could be distributed through the Unidata network. I have been approaching EPA and others but my efforts were unsuccessful. I just want to emphasize that there is a community of atmospheric researchers that asks for AQ data.

T. Dye: The need for AQ data may not be that huge. One of the challenges is that when we install something we have to think about how to run it for the next 10 years. It may be easy for Unidata but not for us. The other point is that we did not quite understand how the Unidata network operates.

U.Shankar: Here is a statement from the CMAS perspective. It is a rather complicated AQ modelling system. The infrastructure that supports its success typically depends upon community participation, which has had a major component of education and training.

P.Kjeld: We have a production environment and we have to make sure that it is not disturbed by anything we do. So we need to separate the data systems, for which we need to allocate people and resources. Also, we are not using all open source software but rather commercial, licensed software that is properly maintained, so we try to do as little development as possible.

P.Eckhardt: In response to Peter, at NILU our prototype WCS server is running on a separate system. Through replication of the operational database into the prototype we can shield the operational system from negative impacts.
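The shielding idea described here, a one-way copy that the prototype reads from, can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with in-memory databases as stand-ins, and the table layout and values are made up. The minutes do not describe NILU's actual database or replication mechanism.

```python
import sqlite3

# Stand-in for the operational database (in-memory for this sketch).
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE obs (station TEXT, day TEXT, ozone_ppb REAL)")
operational.executemany(
    "INSERT INTO obs VALUES (?, ?, ?)",
    [("ST01", "2011-08-01", 41.5), ("ST02", "2011-08-01", 38.0)],  # made-up rows
)
operational.commit()

# Separate copy that the prototype server would read from.
replica = sqlite3.connect(":memory:")
operational.backup(replica)  # one-way copy: operational -> replica

# Changes made on the replica never reach the operational database.
replica.execute("DELETE FROM obs")
replica.commit()

operational_rows = operational.execute("SELECT COUNT(*) FROM obs").fetchone()[0]
replica_rows = replica.execute("SELECT COUNT(*) FROM obs").fetchone()[0]
print(operational_rows, replica_rows)
```

Even a destructive mistake on the prototype side leaves the operational table untouched, which is the property Eckhardt describes; the trade-off is that the replica is only as fresh as the last copy.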

Stefano: I think we are trying to solve too big a problem by seeking a universal solution for interoperability. What we should do instead of this philosophical discussion is just decide: do we want to be in or out of this activity? If you are in, keep your constraints (time, money, personnel) in mind and work together to solve the problem. With regard to education, the best way to educate is through example. If this community can demonstrate the benefits of the network system, others will jump on it.


17:00 - 18:00 User perspective, value chain.
Relationships, cross-thematic links (EGIDA, ESIP), How can we collaborate? Network governance

Wed 18.00-19.00: General Discussion, with wine and cheese

Group dinner

Thu 8.00-10.00: Networking impediments and opportunities

Location: LeoMar, Kjeld/Ludewig

JJ Bogardi, Global Water System Project (GWSP) at EGIDA, Bonn:
Nature of Networking Projects: Complex funding; mixture of paid and voluntary; multiple obligations; international; differing project maturity; governance, cultures
Which glue keeps it together? Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate
"Lethal" ingredients. Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy

  • Clear statements about obstacles: no clear structure; no dedicated funding and no clear idea(s) yet how to do it.
  • Opportunities, fixes: ID manageable work packages; Find, organize, distribute reusable components, resources; Create win-win situation

Thu 10.30-12.30: AQ Community Metadata Discussion

Location: LeoMar, Galmarini/Dye




  • Target User communities
  • Use cases for different applications (from scientists to managers, media people and the general public)?
  • How can users find out about (each) system? Big future issues: data quality, traceability, metadata..

Thu 16.00-18.00: Workshop outputs, outcomes, plans?

Location: LeoMar, Schultz/Husar
Session recording in MP4 format on Vimeo

What are the anticipated outputs? Agreement on community WCS server for grid and point data; server governance, distributed catalog; workshop summary
What are the anticipated outcomes? More servers and data added to the shared data pool and more willing users of shared data.

Thu 18.00-19.00: General Discussion, with wine and cheese

Friday: Boat trip