Difference between revisions of "Solta 2011 Agenda"

From Earth Science Information Partners (ESIP)
(Reverted edits by 207.46.13.145 (talk) to last revision by Martin Schultz (MartinSchultz))
 
(121 intermediate revisions by 6 users not shown)
 
<noinclude>{{AQ CoP Solta2011 Backlinks}}</noinclude>
 
  
'''Monday Evening:''' Registration and Social 
+
== Monday 16:00-18:00: [[Data Catalog Side Meeting]]  ||| [[Talk:Data_Catalog_Side_Meeting|Discussion]] ==
 +
<center>'''Location: LeoMar, Boldrini/Bigagli'''</center>
 +
'''GI-cat''' - ESSI Federated Catalog (aka EuroGEOSS Discovery Broker, part of GEOSS Common Infrastructure)<br>
 +
'''AQComCat''' - Air Quality Community Catalog<br>
 +
'''GI-cat-AQComCat link''' - Federating AQComCat into GI-cat; Accessing GI-cat from AQComCat<br>
  
==Tue 8.00-10.00: Self-Introduction, 5 mins/participant==
+
== Monday 18:00-19:00 Social ==
<center>'''Husar/Vidic'''</center> <br>
+
==Tue 08.00-10.00: Self-Introduction, 5 mins/participant==
 +
<center>'''Location: LeoMar, Husar/Vidic'''</center>  
 +
'''Welcome'''<br>
 +
'''Logistics'''<br>
 +
'''Agenda and Procedures'''<br>
 +
* Daily 8:00-12:30; 16:00-19:00; Coffee break and 18:00-19:00 (1.5 h) informal interaction
 +
* Total of ten two-hour sessions.
 +
** First half focused on data server, catalog software
 +
** Second half on more general AQ data networking
 +
** Two rapporteurs  for each session 
 +
----
 +
'''Self-Introduction''' - by each participant
 +
*Slides 1-2: Name, institution, relevant research (on interoperability, networking), participation in major projects/programs
 +
*Slide(s) 3-(4): What would you like to take away from the workshop; what would you like to offer to the workshop (or AQ CoP)
 +
 
 +
==Tue  10.30-12.30: Introduction of Hubs, 5 min each==
 +
<center>'''Location: LeoMar, Schultz/Bernonville'''</center>
 +
'''Intro of AQ Data Hubs''' - by their representatives<br>
 +
* AQ Community Servers: Common  DataFed, FJ Juelich, NGC/CIERA,  EBAS
 +
* Other Servers: DLR/ACP, AIRNow, EEA NRT, AQMEII  (AeroCom, RSIG, GIOVANNI)<br>
 +
'''Air Quality Community of Practice (AQ CoP)''' - R Husar<br>
 +
'''Air Quality Community Data Server''' - M. Schultz<br>
 +
 
 +
==Tue  16.00-18.00:  IT Breakout: [[AQ Community server software]]==
 +
<center>'''Location: LeoMar, Decker/Hoijarvi'''</center>
 +
'''Use of netCDF and other data formats'''<br>
 +
'''Gridded data service through WCS'''<br>
 +
'''Station-point data service (SQL)'''<br>
 +
'''Data server performance issues/solutions; Server co-development tools'''<br>  
 +
'''Relationship to other WCS servers; Real Data-to-WCS/WFS/WMS-Mapping'''<br>
 +
[[Talk:Candidate_Technical_Topics|Notes]]
 +
 
 +
==Tue  16.00-18.00: Scope and Type of Data to be Served==
 +
<center>'''Location: LeoMar, Vik/Fialkowski'''</center> <br>
 +
 
 +
*  '''Ensemble System''' (JRC) Presentation from Stefano Galmarini on AQMEII and ENSEMBLE (Data Hub/Facilitator)
 +
** Written in Perl and IDL
 +
** Comparisons and evaluation of models
 +
** Coordinated model harmonization across 27 models
 +
** Also a repository for the dissemination of model and AQ measurements
 +
** Produced 4-page tech spec docs that were sent to the groups supplying data; this was a key step in making the process smoother.
 +
** Main features
 +
*** Transfer of very large model output across internet
 +
*** Storage of 1,2,3D model data
 +
*** Quick access to large datasets
 +
*** Distribution of KML and WMS (in progress)
 +
** Use an ASCII format file to describe the model files
 +
** Have a program called ENFORM (Fortran) that transforms model data into the compressed dataset (1.2GB → 200MB)
 +
** Can produce GE projected files (nice)
 +
** 11 papers came out in a special journal edition
 +
 
 +
* '''Data Versioning.''' <br>
 +
AVik suggested that the timestamp of data submission is a useful versioning approach.
 +
** Need additional flags on observation level
 +
** Latest time stamp requires to data ...
 +
** Version dates could confuse broad range of users
 +
*** Martin - Applications would use the version date, not user
 +
** Who are the next users in this chain?
 +
* '''GEO AQ CoP.'''
 +
== AQ CoP ==
 +
* RHusar: What the AQ Community of Practice is and does. [[Media:110824_Solta11_Intro.ppt|CoP Intro PPT]]
 +
** Pic of data pool - started discussion with the idea that CoP should only create distributed data pool and build the data network.
 +
[[Image:SoltaIntro_1.jpeg|300px]] | [[Image:SoltaIntro_2.jpeg|300px]]
 +
*** The group thought this was too narrow and had issues with how the CoP differed from the facilitators.
 +
*** Reworded the CoP purpose: the CoP should "Connect and Enable" AQ data networks, hubs and facilitators so that they can connect and enable the data needed for their systems.
 +
----
 +
 
 +
'''Data Providers''': Existing Data Hubs - how do they work?<br>
 +
'''Data  Classes'''<br>
 +
* by data source-driver (mandated, research)
 +
* by content/platform (emission, ambient, remsens, model)
 +
* by space-time (global, regional)
 +
'''Data Level'''<br>
 +
* Primary (original), Secondary, Mediated
 +
* Raw , processed , how?
 +
 
 +
==Tue 18.00-19.00: General Discussion, with wine and cheese==
 +
 
 +
==Wed  8.00-10.00: Breakout reports, general server items==
 +
<center>'''Location: LeoMar, Domenico/Goussev'''</center>
 +
 
 +
<center><br>'''What few things must be the same, so that everything else can be different?'''<br></center>
 +
 
 +
[[WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type]]
  
In order to make efficient use of our time in Croatia, we ask you all to prepare for the workshop in the following ways (aside from arranging your travel etc.):
 
*Slides 1-2: Name, Institution, Relevant research, development or organizational work on AQ data system interoperability and networking
 
*Slide(s) 3-(4): Involvement and participation in projects, programs, i.e. list of Integrating Initiatives. Potential contributions.
 
  
==Tue  10.30-12.30: Introduction of Hubs, 5 min each, CoP 5 min==
+
'''Report from the IT breakout session: Community server software'''<br>
<center>'''Schultz/Bernonville'''</center> <br>
+
'''Report from non IT breakout session: ADN scope, providers, users'''<br>
This session will be a status report from major data hubs<br>
+
'''Report from the pre-workshop Data Catalog side meeting''' <br>
* DataFed
+
'''Crossover topics between IT and non-IT issues'''<br>
* FJ Juelich
 
* NGC/CIERA
 
* EBAS
 
* DLR/ACP
 
* EEA NRT
 
* AIRNow
 
* AQMEII
 
--Afternoon 16:00 US Nodes: VIEWS, AeroCom, RSIG, GIOVANNI --<br>
 
Existing Data Hubs <br>  
 
* Hubs are established organizations to deliver AQ data
 
* Part of their mandate is to integrate, harmonize data
 
* The data offerings are directed toward clients
 
* Provide data through conventional data transfer
 
  
--- Current - Network comparison---
+
==Wed  10.30-12.30: [[Data network catalog and clients]]==
[http://www.nap.edu/catalog.php?record_id=12916 Data integration] is [http://www.ifi.uzh.ch/stff/pziegler/papers/ZieglerWCC2004.pdf pursued for]:
+
<center>'''Location: LeoMar, Bigagli/Eckhardt'''</center>
* Facilitating access and reuse through a single access point
+
'''Functionality of an Air Quality Data Network Catalog''' Who, what, where, when<br>
* Providing more comprehensive information by combining complementing data.
+
'''Catalog content and structure (granularity) of ADNC?''' <br>
 +
'''Minimal metadata for discovery, data provenance, quality, access constraints?'''<br>
 +
'''Single AQ Catalog? Distributed? Service-oriented?, Access rights'''<br>
  
Data hubs already perform data integration within their respective domains. AQ data networking extends the scope of the integration by connecting
+
* Ben: Similarity with hydrology
 +
* Stefano: GI-cat - help on CSW implementation
 +
*  Paul: Catalog has to be linked to the data offering services
  
* Integrating the integrating hubs (non-intrusively!)
+
'''Catalog Clients'''<br>
* Use a standard interfacing protocol for loose coupling, i.e. networking
 
* Generic processing services that are applicable to all data
 
  
 +
==Wed 16.00-18.00: Relationship, cooperation, governance==
 +
<center>'''Location: LeoMar, Nativi/Robinson'''</center>
  
'''Data Catalogs'''<br>
+
<center>[http://vimeo.com/28437680 Session recording in MP4 format on Vimeo]</center>
* AQ Community Catalog
 
* GI-cat
 
  
 +
3 call-in presentations, (10 mins each max!)<br>
 +
* 16:00 US EPA - Terry Keating (HTAP, CyAir..) ...perspective on AQ data networking;
 +
** The network's contribution is mostly to hemispheric or global scale air pollution, but it also benefits the local scale by specifying the contribution of global emissions to regional AQ.
 +
** HTAP needs AQ network. In that sense it is a client of this network and facilitates the collaboration in science.
 +
** If you connect the data systems, using the same channel we can connect scientists/analysts globally
 +
** The AQ Network needs HTAP because it provides focus and application, demonstrates value and creates a demand for investment. Without demonstrating the value in concrete terms it will not be easy to get funding to build the infrastructure.
 +
** The cyberinfrastructure is focused on the data that are needed and used by EPA, but it should be linked to the global system. So far it is in planning mode on how to move to an interoperable system. It is looking to a broader community such as the CoP to define standard practices.
  
  
Networking
 
"what few things must be the same
 
# Hubs expose a fraction of their holdings as standards-based data access web services
 
# Data resources are 
 
* Role of GEO AQ CoP<br>
 
* Role of Integrating Initiatives
 
  
==Tue  16.00-18.00: IT: Community server software==
+
* 16:10 Nat.Park Serv. - Bret Schichtel (VIEWS) on data collection & usage in VIEWS DSS -
<center>'''Decker/Hoijarvi'''</center> <br>
+
* 16:20 [[Media:Precipitation Presentation with Notes 08-16-2011.pdf|GEO - UIC. Adam Carpenter on GEO Earth Observation Priorities]]
 +
**What kinds of end users does your organization represent and/or interact with?
 +
**What are their specific precipitation data needs?  What is that data needed for?
 +
**Can you provide us documentation?
 +
**Do you have other feedback or comments?
  
This session covers standards and conventions | Implementation for gridded and station data | Development tools | Server performance<br>
 
  
'''Issues re. the use of netCDF and other data formats'''<br>
+
16:30-17:00 Discussion <br>
netCDF is standard format for multi-dimensional data. Cf-netCDF is used both as an archival format of grid data as well as a payload format for WCS queries.
 
* Issue: ambiguity and completeness of CF
 
** Issue: CF (udunits) time format not the same as ISO Time format (as used by WCS)
 
** Issue: geo-referencing (also see CF-ML discussion "the need to store lat/lon coordinates in a CF-compliant netCDF file")
 
** What is missing in CF?
 
** server-independent CF-API package
 
* Issue: We should define a standard python interface (PyNIO, python-netcdf4, scipy.io.netcdf?)
 
* Issue: other output formats
 
** support fused into server or add-on concept (possibly using the public W*S/NetCDF interface)
 
** Delivery of (small) data sets in ASCII/csv format?
 
* Issue: Reading other gridded input data formats? (i.e. GRIB)
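
The CF/ISO time mismatch listed above can be illustrated with a minimal Python sketch. It assumes a standard Gregorian calendar, a UTC reference time and fixed-length units only; the function name <code>cf_time_to_iso</code> is hypothetical and is not part of any CF, netCDF or WCS library.

```python
from datetime import datetime, timedelta, timezone

# Seconds per CF/udunits time unit handled by this sketch (fixed-length units only).
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def cf_time_to_iso(value, units):
    """Convert a CF time offset, e.g. 6 with 'hours since 2011-08-23 00:00:00',
    to an ISO 8601 string such as a WCS time parameter would use."""
    unit, since, reference = units.split(maxsplit=2)
    if since != "since" or unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported CF time units: {units!r}")
    # Assumption: the reference time is UTC, written 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS'.
    fmt = "%Y-%m-%d %H:%M:%S" if " " in reference else "%Y-%m-%d"
    origin = datetime.strptime(reference, fmt).replace(tzinfo=timezone.utc)
    stamp = origin + timedelta(seconds=value * _UNIT_SECONDS[unit])
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
```

A real server would also have to honor the CF <code>calendar</code> attribute and time zone offsets in the reference time, which this sketch deliberately ignores.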
 
  
'''Data server performance issues/solutions'''
+
M. Schultz: Idea of a case study where we demonstrate how the machinery works with the network and in the absence of the network.
* Issue: especially big datasets take a long time to prepare for delivery (slicing/subsetting, etc.)
 
** direct streaming of datasets to the client could be part of the solution, [[Streaming_and_or_netCDF_File|click here]] for details
 
** generated datasets could be cached for a while, so they could be delivered again when there is a request with compatible parameters
 
** problem: both proposals might be mutually exclusive to some degree
 
* Issue: XML Metadata assembly might take a long time depending on the catalogue content, i.e. with a lot of Identifiers
 
** GetCapabilities response Metadata is very static anyway, other responses (DescribeCoverage) could be cached for a while
 
*** attention: DescribeCoverage response depends on parameters
 
* Issue: management overhead when opening NetCDF
 
** when opening a NetCDF file, some metadata has to be read and data structures have to be set up
 
*** input files could be kept open for a while to avoid this overhead
 
* Issue: temp file space is limited on WCS server
 
** streaming approach for store=false parameter would not require additional local storage
 
** temp file approach for store=true parameter could be limited by a maximum dataset size
 
*** requires a reliable output file size estimator
 
*** server would return an exception if estimated size is over given threshold
 
*** would force people to use store=false for large datasets
 
*** should not violate WCS 1.1 standard (too badly) as only store=false is mandatory
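
The temp-file limit described above could be sketched roughly as follows. This is an illustration only, not the AComServ implementation: the names <code>estimate_size_bytes</code>, <code>choose_delivery</code> and <code>DatasetTooLargeError</code>, the 4-byte value size and the 500 MB default threshold are all assumptions, and the estimate ignores file headers and compression.

```python
from math import prod


class DatasetTooLargeError(Exception):
    """Raised when a store=true request would exceed the temp-file budget."""


def estimate_size_bytes(shape, bytes_per_value=4):
    """Rough uncompressed payload size of a gridded subset (assumes 4-byte values)."""
    return prod(shape) * bytes_per_value


def choose_delivery(shape, store, max_stored_bytes=500 * 1024**2):
    """Return 'stream' for store=false, 'tempfile' for store=true within the limit."""
    if not store:
        # Streaming needs no local temp storage, so no size limit is applied here.
        return "stream"
    estimated = estimate_size_bytes(shape)
    if estimated > max_stored_bytes:
        # The server would answer with an exception report instead of writing the file.
        raise DatasetTooLargeError(
            f"estimated {estimated} bytes exceeds limit of {max_stored_bytes}; use store=false"
        )
    return "tempfile"
```

For example, a subset of shape (24, 180, 360) is estimated at about 6.2 MB and would be stored, while a much larger request with store=true would be rejected and the client pointed to store=false.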
 
  
'''Data Servers: Technical Realization (IT) Issues and Solutions'''<br>
+
Tim Dye: How do we bring in other newcomers? Pilot projects?
* which W*S protocol for which purpose, how to combine?
 
** WMS for display/preview of spatial data
 
** WFS .. for station description/spatial metadata?
 
** WCS for "everything else"? (gridded ("raw") datasets)
 
* WCS Data structure hierarchy: DataHub; Service; Coverage: Field; Flag
 
** WCS 1.1 terminology: Service->Group of similar datasets; Coverage->Dataset; Field->Parameter; Flag->Flag
 
  
'''Gridded data service through WCS'''<br>
+
R Husar: Pilots are OK, but if at all possible the pilots should be driven by user need rather than done just for the sake of demonstration.
WCS is implemented in multiple versions: 1.0, 1.1.2, 2.0. The AQ Community Server (AComServ) is now implemented using WCS 1.1.2. This generally works well.
+
 
* Issue: Extraction of vertical levels
+
 
* Issue: current state of WCS 2.0?
+
B Schichtel: What can the CoP-contributed network offer me for AQ analysis?
** core released, but extensions still in draft (how do we know what is valid?)
+
 
* Issue: serve "virtual" WCS datasets with continuous time line assembled from many source files
+
S. Galmarini: Hypothesizing about the possible benefits of the networks is not very productive, because they will depend on those who are participating. It's only after the connections are made that one can observe what benefit the network actually created.
** create a "wrapper" module that can handle such cases?
+
 
** Kari has already done something like this for HTAP datasets, this could be a starting point
+
T. Keating: I think this is exactly the stage that we are at from the HTAP perspective. The dream we had is that we could create a distributed network of modelling information, observational information of various types and emissions information, and then, through web-browser-based tools, access those distributed databases and do analyses comparing models to models, models to observations, models to emissions, emissions to observations, etc. I think we are only now at a stage where we have developed enough robustness in the connections between some of those distributed databases that we can begin to build tools to do some of those analyses. I think the sort of thing that Bret is after is the sort of system we want to develop: one that can access those distributed databases and answer questions. We need to develop analytical applications that can do that.
* Issue: desirable time filtering options in WCS: hour of day, day of week, day of month, etc.
+
 
** Kari has already created such filters, but so far they are outside the standard
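
The time filters named above (hour of day, day of week, day of month) could look roughly like the following Python sketch. It is not Kari's implementation and not part of the WCS standard; the function name <code>filter_times</code> and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def filter_times(times, hours=None, weekdays=None, days_of_month=None):
    """Keep the timestamps that match every filter given; None means no constraint.

    hours: set of hours 0-23; weekdays: set of 0 (Monday) to 6 (Sunday);
    days_of_month: set of 1-31.
    """
    def keep(t):
        return (
            (hours is None or t.hour in hours)
            and (weekdays is None or t.weekday() in weekdays)
            and (days_of_month is None or t.day in days_of_month)
        )

    return [t for t in times if keep(t)]


# Example: two days of hourly time steps, reduced to the noon values only.
hourly = [datetime(2011, 8, 23) + timedelta(hours=h) for h in range(48)]
noon_only = filter_times(hourly, hours={12})
```

Such a filter would be applied to the time axis of a coverage before subsetting, so that only the matching time steps are read and delivered.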
+
B Schichtel: Access to data for emissions, models and observations, and possibly tools, would also be beneficial to my project in the US National Park Service.
 +
 
 +
E Robinson:
 +
 
 +
R Husar: Interoperability has to take place at the organization level, the data level... (show image)
 +
 
 +
M. Schultz: Such a collaboration between the MACC program and EPA takes place not only at the program level but also at the data-sharing level. Ten or 20 years ago this would not have been possible.
 +
 
 +
B. Domenico: I am puzzled about what the issues are. As I recall, DataFed was available 5 years ago. Is the community not using the data?
 +
 
 +
R Husar: The problem is the slowness of the pace at which AQ networking is evolving... Over the past 5 years businesses have adopted social networking, but there was little change in AQ networking. My feeling is that the impediments are not technological. But what are they? Unless we understand the actual impediments, this AQ CoP effort will be just "another wreck along the road" toward interoperable systems. We will address these impediments during the Thursday morning session.
 +
 
 +
B. Domenico: Now I understand some of the issues. I also notice that at this meeting we have representatives of data systems and technologies, but the users of AQ data are not well represented here. 25 years ago when Unidata was formed, it was the users who got together and declared that somebody had to help with accessing the data.
 +
Driving force is the user community.
  
'''Delivery of station-point data'''<br>
+
P. Kjeld: Yesterday I learned that this community has chosen the WCS protocol for delivering AQ data. We will then implement a WCS server to share the EEA datasets. However, EEA serves many communities and some require very different data delivery. So which delivery procedure should we focus on?
* Issue: use WCS or WFS, Combination of both/which combination?  
 
  
'''Access rights'''<br>
+
M. Schultz: This is why we need a compelling demonstration of the connected system. For instance, in the DataFed viewer and now at Juelich you can take data from multiple sources and overlay them as if they were part of a single data system. The benefits of switching to new systems are not apparent to the operational people.
* Issue: technical options to restrict access to datasets?
 
  
'''Server co-development tools, methods'''<br>
+
T. Keating: Martin is right. In particular, education about the approach to sharing is important. When we talk to people who are running different data systems, their response is "we are sharing the data". But their concept of data sharing is that it's downloadable. We do have the technologies, but a lot of people don't know about them or about the approaches by which web service technologies can be used. The Community of Practice could do the educating.
Server code is maintained through SourceForge, Darcs code repositories are available at WUSTL and in Juelich.
+
Generate the educational materials.
* Issues: Platform independence (netcdf interface), Documentation
 
  
'''Relationship to non-AComServ (non-NetCDF) WCS servers'''<br>
+
R. Husar: Stefano Nativi has been a major driving force for introducing informatics as a connector and integrator for earth science activities in Europe. Stefano, what is your perspective?
* data format(s)
 
** many WCS clients don't understand NetCDF
 
* Issue: protocol compatibility
 
** might need to implement more optional features of WCS
 
* standard compliance
 
** will need a test suite for 1.1.2 (and manage to run it)
 
  
'''Real Data-to-WCS-Mapping Structure'''<br>
+
S. Nativi: I listened to your discussion with considerable interest. A similar dialogue on why and how to share data has been going on in the biodiversity community, with quite successful outcomes. They prepared documents and a work plan that outline the activities toward data sharing and integration. It was driven by science. Another contribution of the community is to collect best practices.
* Data hub that exposes the data ==> Provider    ==>  WCS Service 
 
* Observation platform or network ==> Dataset    ==>  WCS Coverage
 
* Observation parameter/variable ==> Parameter ==> WCS Field
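
The mapping above (data hub to WCS Service, observation network to WCS Coverage, parameter to WCS Field, plus the Flag level of the hierarchy) can be sketched as plain Python data classes. This is a minimal illustration, not the community server's actual data model; all class names, the function name and the example values in the usage note are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WcsField:
    """Observation parameter/variable, exposed as a WCS Field."""
    name: str
    flags: List[str] = field(default_factory=list)


@dataclass
class WcsCoverage:
    """Observation platform or network (a dataset), exposed as a WCS Coverage."""
    name: str
    fields: List[WcsField] = field(default_factory=list)


@dataclass
class WcsService:
    """Data hub that exposes the data (the provider), exposed as a WCS Service."""
    provider: str
    coverages: List[WcsCoverage] = field(default_factory=list)


def list_field_paths(service):
    """Flatten the hierarchy into 'provider/coverage/field' strings."""
    return [
        f"{service.provider}/{cov.name}/{fld.name}"
        for cov in service.coverages
        for fld in cov.fields
    ]
```

With a made-up hub <code>WcsService("ExampleHub", [WcsCoverage("example_network", [WcsField("o3"), WcsField("pm25")])])</code>, <code>list_field_paths</code> yields one path per parameter, which is roughly the granularity a catalog entry would need.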
 
  
==Tue  16.00-18.00: NoIT: ADN scope, providers, users==
+
R Husar: What do you think makes the biodiversity group work so well?
<center>'''Vik/Fialkowski'''</center> <br>
 
----
 
** VIEWS
 
** GIOVANNI
 
** RSIG
 
** AEROCom
 
  
==Wed  8.00-10.00: Breakout reports, general server items==
+
S.Nativi:
<center>'''Eckhardt, Goussev'''</center> <br>
 
'''Report from the IT breakout session: Community server software'''<br>
 
'''Report from non IT breakout session: ADN scope, providers, users'''<br>
 
'''Report from the pre-workshop Data Catalog side meeting'''
 
'''Possible discussion topics/focus on cross-overs between IT and non-IT issues'''<br>
 
* standard definitions (clarity, ambiguity, completeness, ...)
 
* standard development and documentation
 
* open-source server software development
 
* platform issues, portability
 
* coding language(s), code interchangeability
 
* coding style and software development approaches
 
* Data Content
 
* organisation of data
 
* data formats, standard compliance
 
* data access
 
* performance
 
* flexibility
 
* user friendliness
 
* meeting user demands (fitness for purpose)
 
* governance, responsibilities, etc.
 
* Open Source collaborative approach. Issues?
 
* General software design: Multi-layer, Multi-protocol. Standard-Convention driven
 
* Porting, Installation. Issues?
 
* Maintenance, governance. Issues?
 
* Criteria for single (trusted ?) 'primary' data source
 
* Designations for secondary, derived, augmented data sources
 
  
==Wed  10.30-12.30: AQ network: Servers, Catalog, Clients==
+
R. Husar: Following the comments of Terry and Stefano, this is a reminder to our software developer subgroup that identifying and describing their best practices would be highly desirable.
<center>'''Bigagli/Robinson'''</center> <br>
 
  
Preparing the way forward...
+
B. Domenico: The Unidata user community, made up primarily of meteorologists in academia and other researchers, has been interested in accessing real-time air quality data. I have been asked to facilitate access to AQ data so that it could be distributed through the Unidata network. I have been approaching EPA and others, but my efforts were unsuccessful. I just want to emphasize that there is a community of atmospheric researchers that asks for AQ data.
  
'''What few things must be the same, so that everything else can be different?'''
+
T. Dye: The need for AQ data may not be that huge. One of the challenges is that when we install something we have to think about how to run it for the next 10 years. It may be easy for Unidata but not for us. The other point is that we did not quite understand how the Unidata network operates.
  
Metadata for finding and understanding (CF, ISO)<br>
+
U. Shankar: Here is a statement from the CMAS perspective. It is a rather complicated AQ modelling system. The infrastructure that supports its success typically depends on community participation with a major component of education and training.
Data access/use constraints, quality control, data versioning, etc.<br>
 
'''What is the design philosophy'''<br>
 
Service oriented (everything is a service), Component and network design for change; open source (everything?!) <br>
 
'''Network-level data flow, usage statistics (GoogleAnalytics), performance'''<br>
 
... goal is to obtain a good basis for discussion in the following breakout sessions, both from the IT and non-IT sides.
 
* Server Software Design (uFIND). Issues?
 
'''Functionality of an Air Quality Data Network Catalog (ADNC)?'''<br>
 
'''Content and structure (granularity) of ADNC?''' <br>
 
'''Interoperability of ADNC<br>'''
 
* Interoperability with whom? what standards are needed? CF Naming extensions?<br>
 
* AQ Discovery Metadata Convention (for use in ISO, Data Catalogs...)
 
* Extend CF Naming conventions for Point Data
 
* Devise human-readable CF naming equivalents?
 
'''Access rights and access management'''<br>
 
  
'''What are the generic (ISO, GEOSS, INSPIRE) and the AQ-specific discovery metadata?'''<br>
+
P. Kjeld: We have a production environment and we have to make sure that it is not disturbed by anything we do. So we need to separate the data systems, for which we need to allocate people and resources. Also, we are not using all open source software but rather commercial, licensed software that is properly maintained, so we try to do as little development as possible.
'''Minimal metadata for data provenance, quality, access constraints?'''<br>
 
'''Single AQ Catalog? Distributed? Service-oriented?'''<br>
 
* GI-cat
 
* uFind
 
  
==Wed 16.00-18.00: Relationship, cooperation, governance==
+
P. Eckhardt: In response to Peter, at NILU our prototype WCS server is running on a separate system. By replicating the operational database into the prototype we can shield the operational system from negative impacts.
<center>'''Nativi/Domenico'''</center> <br>
 
* User perspective, value chain .. user can not find..
 
----
 
* EPA - HTAP Terry Keating??
 
* EEA - H. Anderson??, Peder ??
 
  
 +
Stefano: I think we are trying to solve too big a problem by seeking a universal solution for interoperability. Instead of this philosophical discussion, we should just decide whether we want to be in or out of this activity. If you are in, keep your constraints (time, money, personnel) in mind and work together to solve the problem. With regard to education, the best way to educate is through example. If this community can demonstrate the benefits of the networked system, others will jump on it.
  
  
Relationships, cross-thematic links (EGIDA, ESIP)
 
How can we collaborate? 
 
  
 +
17:00 - 18:00 User perspective, value chain.<br>
 +
Relationships, cross-thematic links (EGIDA, ESIP), How can we collaborate? Network governance 
 
* http://www.delicious.com/tag/integratinginitiative+governance
 
 +
 +
==Wed 18.00-19.00: General Discussion, with wine and cheese==
  
 
<big><center>Group dinner</center></big>
 
  
==Thu  8.00-10.00: Networking impediments, opportunities==
+
==Thu  8.00-10.00: [[Networking impediments and opportunities]]==
<center>'''Kjeld/Ludewig'''</center>  <br>
+
<center>'''Location: LeoMar, Kjeld/Ludewig'''</center>  
* Clear statements about obstacles..
+
[http://www.egida-project.eu/images/documents/bogardibonn.pdf JJ Bogardi, Global Water System Project (GWSP)] at EGIDA, Bonn:<br>
** no organizational structure,
+
'''Nature of Networking Projects:''' Complex funding; mixture of paid and voluntary; multiple obligations; international; differing project maturity; governance, cultures <br>
** no dedicated funding or
 
** no clear idea yet how to do it (or several independent ideas?).
 
* Opportunities, fixes
 
** Identification of manageable work packages
 
** Reusable components, resources
 
----
 
JJ Bogardi, Global Water System Project (GWSP) at EGIDA, Bonn:<br>
 
'''Nature of Networking Projects:''' Complex funding .., multiple obligations. Interdisciplinary and international. Differing project maturity. Mixture of paid and voluntary contributors. Governance and project cultures may differ. <br>
 
 
'''Which glue keeps it together?''' Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate <br>
 
'''„Lethal“ ingredients.''' Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy<br>
+
'''"Lethal" ingredients.''' Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy<br>
----
+
 
What can we do to achieve a "win-win" situation?
+
* Clear statements about obstacles: no clear structure; no dedicated funding and no clear idea(s) yet how to do it.
 +
* Opportunities, fixes: ID manageable work packages; Find, organize, distribute reusable components, resources; Create win-win situation
 +
 
 +
==Thu  10.30-12.30: [[AQ Community Metadata Discussion]] ==
 +
<center>'''Location: LeoMar, Galmarini/Dye'''</center>
 +
 
 +
 
 +
 
  
==Thu  10.30-12.30: ADN user relations/help, whom, what?==
 
<center>'''Galmarini/Dye'''</center> <br>
 
  
 +
----
 
* Target User communities
 
 
* Use cases for different applications (from scientists to managers, media people and the general public)?
 
* How can users find out about (each) system? Big future issues: data quality, traceability, metadata..
 
  
'''Relationship to non-AComServ WCS servers'''<br>
+
==Thu  16.00-18.00: [[Workshop outputs, outcomes, plans? ]]==
* Issue: protocol compatibility, standard compliance, data format(s)
+
<center>'''Location: LeoMar, Schultz/Husar'''</center>
* Issue: is there a need? How can we benefit from "other" data? How can they benefit from AQ data?
+
<center>[http://vimeo.com/28432407 Session recording in MP4 format on Vimeo]</center>
* Issue: which protocols? (OpenDAP?, GIS servers?)
+
'''What are the anticipated outputs?''' Agreement on [[WCS_Server_Software|community WCS server]] for grid and point data; server governance, distributed catalog; workshop summary<br>
 +
'''What are the anticipated outcomes?''' More servers and data added to the shared data pool and more willing users of shared data.<br>
  
==Thu 16.00-18.00: Workshop outputs, outcomes, plans? ==
+
==Thu 18.00-19.00: General Discussion, with wine and cheese==
<center>'''Schultz/Husar'''</center> <br>
 
  
'''What are the anticipated outputs?''' Agreement on [[WCS_Server_Software|community WCS server]] for grid and point data; server governance, distributed catalog; workshop summary<br>
 
'''What are the anticipated outcomes?''' Better understanding of the network, higher level of trust and ''concrete steps toward turning the ADN from virtual to real'' <br>
 
What are the short-term opportunities? Do we have common long-term goals and visions?
 
 
<big><center>Friday: Boat trip</center></big>
 

Latest revision as of 10:22, July 28, 2012


Monday 16:00-18:00: Data Catalog Side Meeting ||| Discussion

Location: LeoMar, Boldrini/Bigagli

GI-cat - ESSI Federated Catalog (aka EuroGEOSS Discovery Broker, part of GEOSS Common Infrastructure)
AQComCat - Air Quality Community Catalog
GI-cat-AQComCat link - Federating AQComCat into GI-cat; Accessing GI-cat from AQComCat

Monday 18:00-19:00 Social

Tue 08.00-10.00: Self-Introduction, 5 mins/participant

Location: LeoMar, Husar/Vidic

Welcome
Logistics
Agenda and Procedures

  • Daily 8:00-12:30; 16:00-19:00; Coffee break and 18:00-19:00 (1.5 h) informal interaction
  • Total of ten two-hour sessions.
    • First half focused on data server, catalog software
    • Second half on more general AQ data networking
    • Two rapporteurs for each session

Self-Introduction - by each participant

  • Slides 1-2: Name, institution, relevant research (on interoperability, networking), participation in major projects/programs
  • Slide(s) 3-(4): What would you like to take away from the workshop; what would you like to offer to the workshop (or AQ CoP)

Tue 10.30-12.30: Introduction of Hubs, 5 min each

Location: LeoMar, Schultz/Bernonville

Intro of AQ Data Hubs - by their representatives

  • AQ Community Servers: Common DataFed, FJ Juelich, NGC/CIERA, EBAS
  • Other Servers: DLR/ACP, AIRNow, EEA NRT, AQMEII (AeroCom, RSIG, GIOVANNI)

Air Quality Community of Practice (AQ CoP) - R Husar
Air Quality Community Data Server - M. Schultz

Tue 16.00-18.00: IT Breakout: AQ Community server software

Location: LeoMar, Decker/Hoijarvi

Use of netCDF and other data formats
Gridded data service through WCS
Station-point data service (SQL)
Data server performance issues/solutions; Server co-development tools
Relationship to other WCS servers; Real Data-to-WCS/WFS/WMS-Mapping
Notes
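
As a rough illustration of the "gridded data service through WCS" item above, the following minimal sketch assembles a WCS 1.0.0 key-value GetCoverage request. The endpoint URL, the coverage name `pm25`, and the helper name `build_getcoverage_url` are assumptions made for the example only; they are not taken from the workshop servers.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_getcoverage_url(endpoint, coverage, bbox, time, width, height,
                          fmt="NetCDF"):
    """Assemble a WCS 1.0.0 key-value GetCoverage URL (illustrative helper)."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                      # hypothetical coverage name
        "CRS": "EPSG:4326",                        # lon/lat bounding box
        "BBOX": ",".join(str(v) for v in bbox),    # minx,miny,maxx,maxy
        "TIME": time,                              # ISO 8601 time position
        "WIDTH": width,                            # output grid size in cells
        "HEIGHT": height,
        "FORMAT": fmt,                             # format support is server-specific
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage, for illustration only.
url = build_getcoverage_url(
    "http://example.org/wcs", "pm25",
    bbox=(-10, 35, 30, 60), time="2011-08-23T00:00:00Z",
    width=80, height=50,
)
query = parse_qs(urlparse(url).query)
print(query["BBOX"][0])
```

A real server advertises its actual coverage names and supported formats in its GetCapabilities and DescribeCoverage responses, so those values would replace the placeholders used here.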

Tue 16.00-18.00: Scope and Type of Data to be Served

Location: LeoMar, Vik/Fialkowski


  • Ensemble System (JRC) Presentation from Stefano Galmarini on AQMEII and ENSEMBLE (Data Hub/Facilitator)
    • Written in Perl and IDL
    • Comparisons and evaluation of models
    • Coordinated model harmonization across 27 models
    • Also a repository for the dissemination of model and AQ measurements
    • Produced 4-page tech spec docs that were sent to groups for supplying data; this was a key step in making the process smoother.
    • Main features
      • Transfer of very large model output across internet
      • Storage of 1D, 2D and 3D model data
      • Quick access to large datasets
      • Distribution of KML and WMS (in progress)
    • Use an ASCII format file to describe the model files
    • Have a program called ENFORM (Fortran) that transforms model data into the compressed dataset (1.2GB → 200MB)
    • Can produce GE projected files (nice)
    • 11 papers came out in a special journal edition
  • Data Versioning.

AVik: Suggested that the timestamp of data submission is a useful versioning approach.

    • Need additional flags on observation level
    • Latest time stamp requires to data ...
    • Version dates could confuse broad range of users
      • Martin - Applications would use the version date, not users
    • Who are the next users in this chain?
  • GEO AQ CoP.
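
The timestamp-based versioning suggested under Data Versioning above can be sketched as follows. This is only an illustration under stated assumptions: the `Submission` record, its field names, and the `version_as_of` helper are hypothetical and do not describe how ENSEMBLE or any of the hubs actually store versions.

```python
from datetime import datetime
from typing import NamedTuple, Optional, Sequence


class Submission(NamedTuple):
    key: str             # hypothetical observation id, e.g. "station/parameter/time"
    value: float
    submitted: datetime  # submission timestamp doubles as the version


def version_as_of(history: Sequence[Submission], key: str,
                  as_of: Optional[datetime] = None) -> Optional[Submission]:
    """Return the newest submission for `key` made at or before `as_of`.

    With as_of=None the latest version is returned.
    """
    candidates = [
        s for s in history
        if s.key == key and (as_of is None or s.submitted <= as_of)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.submitted)


# Made-up values for illustration: one observation, submitted twice.
history = [
    Submission("siteA/O3/2011-08-01T12", 41.0, datetime(2011, 8, 2)),
    Submission("siteA/O3/2011-08-01T12", 39.5, datetime(2011, 8, 20)),
]
latest = version_as_of(history, "siteA/O3/2011-08-01T12")
earlier = version_as_of(history, "siteA/O3/2011-08-01T12", datetime(2011, 8, 10))
print(latest.value, earlier.value)
```

This matches the point raised in the discussion that applications, rather than end users, would work with the version date: an application can pin an analysis to a fixed `as_of` date while ordinary users simply receive the latest submission.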

AQ CoP

  • RHusar: What is the AQ Community of Practice and what does it do? CoP Intro PPT
    • Pic of data pool - started discussion with the idea that CoP should only create distributed data pool and build the data network.

SoltaIntro 1.jpeg | SoltaIntro 2.jpeg

      • The group thought this was too narrow and had issues with how the CoP differed from the facilitators.
      • Reworded the CoP purpose: the CoP should "Connect and Enable" AQ data networks, hubs and facilitators so that they can connect and enable the data needed for their systems.

Data Providers: Existing Data Hubs - how do they work?
Data Classes

  • by data source-driver (mandated, research)
  • by content/platform (emission, ambient, remsens, model)
  • by space-time (global, regional)

Data Level

  • Primary (original), Secondary, Mediated
  • Raw, processed, how?

Tue 18.00-19.00: General Discussion, with wine and cheese

Wed 8.00-10.00: Breakout reports, general server items

Location: LeoMar, Domenico/Goussev

What few things must be the same, so that everything else can be different?

WCS_Server_Software#WCS_Server_for_Station-Point_Data_Type


Report from the IT breakout session: Community server software
Report from non IT breakout session: ADN scope, providers, users
Report from the pre-workshop Data Catalog side meeting
Crossover topics between IT and non-IT issues

Wed 10.30-12.30: Data network catalog and clients

Location: LeoMar, Bigagli/Eckhardt

Functionality of an Air Quality Data Network Catalog: Who, what, where, when
Catalog content and structure (granularity) of ADNC?
Minimal metadata for discovery, data provenance, quality, access constraints?
Single AQ Catalog? Distributed? Service-oriented? Access rights

  • Ben: Similarity with hydrology
  • Stefano: GI-cat - help on CSW implementation
  • Paul: Catalog has to be linked to the data offering services

Catalog Clients

Wed 16.00-18.00: Relationship, cooperation, governance

Location: LeoMar, Nativi/Robinson
Session recording in MP4 format on Vimeo

3 call-in presentations, (10 mins each max!)

  • 16:00 US EPA - Terry Keating (HTAP, CyAir..) ...perspective on AQ data networking;
    • The network contributes mostly to understanding hemispheric or global scale air pollution, but it also benefits the local scale by specifying the contribution of global emissions to regional AQ.
    • HTAP needs the AQ network. In that sense it is a client of this network and facilitates collaboration in science.
    • If you connect the data systems, we can use the same channel to connect scientists/analysts globally.
    • The AQ Network needs HTAP because it provides focus and application, demonstrates value, and creates a demand for investment. Without demonstrating the value in concrete terms it will not be easy to get funding to build the infrastructure.
    • The cyberinfrastructure is focused on the data that are needed and used by EPA, but it should be linked to the global system. So far, how to move to an interoperable system is still in the planning stage. It is looking to a broader community such as the CoP to define standard practices.


  • 16:10 Nat.Park Serv. - Bret Schichtel (VIEWS) on data collection & usage in VIEWS DSS -
  • 16:20 GEO - UIC. Adam Carpenter on GEO Earth Observation Priorities
    • What kinds of end users does your organization represent and/or interact with?
    • What are their specific precipitation data needs? What is that data needed for?
    • Can you provide us documentation?
    • Do you have other feedback or comments?


16:30-17:00 Discussion

M. Schultz: Idea of a case study where we demonstrate how the machinery works with the network and in the absence of the network.

Tim Dye: How do we bring in other newcomers? Pilot projects?

R Husar: Pilots are OK, but if at all possible the pilots should be driven by user need rather than done just for the sake of demonstration.


B Schichtel: What can the CoP-contributed network offer me for AQ analysis?

S. Galmarini: Hypothesizing what possible benefits the network can have is not very productive because it will depend on those who are participating. It is only after the connections are made that one can observe what benefit this network actually created.

T. Keating: I think this is exactly the stage that we are at from the HTAP perspective. The dream we had was to create a distributed network of modelling information, observational information of various types, and emissions information, and then to access those distributed databases through web-browser-based tools and do analysis comparing models to models, models to observations, models to emissions, emissions to observations, etc. I think we are only now at a stage where we have developed enough robustness in the connections between some of those distributed databases that we can begin to build tools to do some of those analyses. I think the sort of thing that Bret is after is the sort of system we want to develop: one that can access those distributed databases and answer questions. We need to develop analytical applications that can do that.

B Schichtel: Access to data for emissions, models and observations, and possibly tools, would also be beneficial to my project in the US National Park Service.

E Robinson:

R Husar: Interoperability has to take place at the organization level, the data level... (show image)

M.Schultz: Such a collaboration between the MACC program and EPA takes place not only at the program level but also at the data sharing level. 10 or 20 years ago this would not have been possible.

B.Domenico: I am puzzled about what the issues are. As I recall, DataFed was available 5 years ago. Is the community not using the data?

R Husar: The problem is the slow pace at which AQ networking is evolving... Over the past 5 years businesses have adopted social networking, but there was little change in AQ networking. My feeling is that the impediments are not technological. But what are they? Unless we understand the actual impediments, this AQ CoP effort will be just "another wreck along the road" toward interoperable systems. We will address these impediments during the Thursday morning session.

B.Domenico: Now I understand some of the issues. I also notice that at this meeting we have representatives of data systems and technologies, but the users of AQ data are not well represented here. 25 years ago, when Unidata was formed, it was the users who got together and declared that somebody had to help with accessing the data. The driving force is the user community.

P. Kjeld: Yesterday I learned that this community has chosen the WCS protocol for delivering AQ data. We will then implement a WCS server to share the EEA datasets. However, EEA serves many communities and some require very different data delivery. So which delivery procedure should we focus on?

M. Schultz: This is why we need a compelling demonstration of the connected system. For instance, in the DataFed viewer and now at Juelich you can take data from multiple sources and overlay them as if they were part of a single data system. The benefits of switching to new systems are not apparent to the operational people.

T.Keating: Martin is right. In particular, education about the approach to sharing is important. When we talk to people who are running different data systems, their response is "we are sharing the data", but their concept of data sharing is that the data is downloadable. We do have the technologies, but a lot of people don't know about them or about the ways in which web service technologies can be used. The Community of Practice could do the educating and generate the educational materials.

R. Husar: Stefano Nativi has been a major driving force for introducing informatics as a connector and integrator for earth science activities in Europe. Stefano, what is your perspective?

S. Nativi: I listened to your discussion with considerable interest. A similar dialogue on why and how to share data has been going on in the biodiversity community, with quite successful outcomes. They prepared documents and a work plan that outline the activities toward data sharing and integration. It was driven by science. Another contribution of the community is to collect best practices.

R Husar: What do you think makes the biodiversity group work so well?

S.Nativi:

R. Husar: Following the comments of Terry and Stefano, this is a reminder to our software developer subgroup that identifying and describing their best practices would be highly desirable.

B.Domenico: The Unidata user community, made up primarily of meteorologists in academia and other researchers, has been interested in accessing real-time air quality data. I have been asked to facilitate access to AQ data so that it could be distributed through the Unidata network. I have been approaching EPA and others, but my efforts were unsuccessful. I just want to emphasize that there is a community of atmospheric researchers that asks for AQ data.

T. Dye: The need for AQ data may not be that huge. One of the challenges is that when we install something we have to think about how to run it for the next 10 years. It may be easy for Unidata but not for us. The other point is that we did not quite understand how the Unidata network operates.

U.Shankar: Here is a statement from the CMAS perspective. It is a rather complicated AQ modelling system. Infrastructure that supports its success typically depends upon community participation that has a major component of education and training.

P.Kjeld: We have a production environment and we have to make sure that it is not disturbed by anything we do. So we need to separate the data systems, for which we need to allocate people and resources. Also, we are not using all open source software but rather commercial, licensed software that is properly maintained, so we try to do as little development as possible.

P.Eckhardt: In response to Peter, at NILU our prototype WCS server is running on a separate system. By replicating the operational database into the prototype we can shield the operational system from negative impacts.

Stefano: I think we are trying to solve too big a problem by seeking a universal solution for interoperability. Instead of this philosophical discussion, we should just decide: do we want to be in or out of this activity? If you are in, keep your constraints (time, money, personnel) in mind and work together to solve the problem. With regard to education, the best way to educate is through example. If this community can demonstrate the benefits of the networked system, others will jump on it.


17:00 - 18:00 User perspective, value chain.
Relationships, cross-thematic links (EGIDA, ESIP), How can we collaborate? Network governance

Wed 18.00-19.00: General Discussion, with wine and cheese

Group dinner

Thu 8.00-10.00: Networking impediments and opportunities

Location: LeoMar, Kjeld/Ludewig

JJ Bogardi, Global Water System Project (GWSP) at EGIDA, Bonn:
Nature of Networking Projects: Complex funding; mixture of paid and voluntary; multiple obligations; international; differing project maturity; governance, cultures
Which glue keeps it together? Trust and personal affinity. Common objectives and scientific values. Mutual respect. Mutual benefit (win-win). Complementarity. Donor dictate
"Lethal" ingredients: Turf mentality. Budget discrepancies. Too much competition. Lack of data and information exchange. Donor jealousy

  • Clear statements about obstacles: no clear structure; no dedicated funding and no clear idea(s) yet how to do it.
  • Opportunities, fixes: ID manageable work packages; Find, organize, distribute reusable components, resources; Create win-win situation

Thu 10.30-12.30: AQ Community Metadata Discussion

Location: LeoMar, Galmarini/Dye




  • Target User communities
  • Use cases for different applications (from scientists to managers, media people and the general public)?
  • How can users find out about (each) system? Big future issues: data quality, traceability, metadata..

Thu 16.00-18.00: Workshop outputs, outcomes, plans?

Location: LeoMar, Schultz/Husar
Session recording in MP4 format on Vimeo

What are the anticipated outputs? Agreement on community WCS server for grid and point data; server governance, distributed catalog; workshop summary
What are the anticipated outcomes? More servers and data added to the shared data pool and more willing users of shared data.

Thu 18.00-19.00: General Discussion, with wine and cheese

Friday: Boat trip