Difference between revisions of "Talk:Air Quality/Chemistry Naming Conventions"

From Earth Science Information Partners (ESIP)
 
(87 intermediate revisions by 11 users not shown)
Line 1: Line 1:
{{backlink}}
+
{{CF-links}}
<center>'''''General discussion on [[Air Quality/Chemistry Naming Conventions]]. If needed, practice editing in the [[EsipSandBox| Sandbox]]'''''</center> 
 
{{edithelp}}<br><br><br><br><br><br>
 
  
==CF Naming Extensions==
+
Go to [[Agreed_Items_Air_Quality/Chemistry_Naming_Convention|Agreed Items of Discussion on Air_Quality/Chemistry_Naming_Conventions - General ]].
I am in charge of organising the definition of new names for the CF
 
convention, both as an outcome of our Ispra HTAP meeting as well as by
 
the EU-project GEMS. Do you have any information on what the status of
 
naming conventions for aerosols and chemicals is? I have recently sent an
 
email to Jonathan Gregory and Bryan Lawrence (see below), but did not
 
receive a response yet. [[User:ChristianeTextor |ChristianeTextor ]] 19:26, 10 May 2006 (EDT)
 
----
 
:I give you the link to the ACCENT Photocomp [http://www.accent-network.org/farcry_accent/index.cfm?objectid=7C85FF9F-BCDC-BAD1-A5B24AD8463A158A&navid=3F6B52D0-802E-71AA-A15D8BB7251B563A input/output requirements]. Also have a look at the O3 RF. I guess we should discriminate between:
 
  
:a) components (and derived components)
 
:b) fluxes (e.g. O3 deposition, strat-trop, chemical production, chemical destruction)
 
:c) state variables like temperature, pressure, (and derived) radiative forcing
 
:d) model info lon-lat structure, etc. [[User:FrankDetener|FrankDetener]]
 
----
 
:Now CF is quite widely used, it is recognised that more explicit arrangements are needed for governing its development and giving it status and permanence. The original authors and others have been discussing how to do this. Something should be in place before long.
 
  
:As regards the development of standard names for chemistry and aerosol, I would suggest that the work done by Peter van Velthoven for  [http://www.knmi.nl/~velthove/PRISM/CF/guidelines_chemistry_1.3.htm PRISM is a good starting] point. A good deal of thought was invested in that. However, since not many chemical names have so far been added, as you remark, these guidelines are only proposals, not requirements. [[User:JonathanGregory|JonathanGregory]]
+
=Martin Schultz playing the devil's advocate =
----
 
:Thank you for your note on CF naming extensions. Your work will be a very important part of developing interoperability among projects, programs, agencies and countries. Given the formal consensus-based procedures and the broad acceptance of the CF conventions in the met/ocean communities, it is a very attractive model for creating an Air Chemistry extension to the existing Standard Names.
 
  
:In order to collect the information on this topic, we have set up this [[Air Quality/Chemistry Naming Conventions|wiki page]]. It lists the standard air chemistry names in the CF convention (as of April 7, 2006). It is evident that the current list is very limited and it clearly requires expansion to accommodate the needs of this HTAP/GEMS project as well as the needs for other air quality/chemistry-related names.
+
Hi,
  
:I do not have a direct interaction with the CF naming custodians, however I gather from the website that there is a particular e-mail address where naming requests are submitted. The wiki page on air quality/chemistry naming also contains links to three additional standard name collections:
+
very good! It is becoming more and more clear to me that a lot of systematic thinking already went into the CF standards (and certainly Jonathan deserves a lot of credit for this). Yet, I am still a bit sceptical whether this can really get acceptance by the large community if they need to adapt so thoroughly and get rid of many old habits and custom units. Microsoft also made its fortune by challenging the customer with small changes at a time and sacrificing the perfect system for a better chance to drag the crowd along. Translated to our problem at present, I am still wondering if it wouldn't be better to define some non-udunits "interim standards" just to keep people happy. And if they swallow the first bite and implement CF in their models and tools, one can then in a few years' time work on making the system more stringent. My concern is also related to the non-existence of suitable evaluation tools which will make good use of all the nice attributes and standard names. More and more I get the impression that we are trying to model too much semantic sophistication into the definitions, which makes it practically impossible to project onto a software code as the complexity of this code must be quite large from the start. Yet another concern is my experience with improper netcdf files. Every error that can be made will be made at some point, and if we rely too much on the meaning of attributes, we are certain to get garbage results quite soon. One can of course implement some checking for consistency etc., but I see it as highly improbable that one will be able to catch all errors, and the system is becoming complex enough that it will be difficult to diagnose an error and correct it. Just two simple examples of what can easily go wrong:
  
:* EPA Air Quality System (AQS)
+
(1) a certain software tool requires the ordering of levels from top to bottom, and thus you need a small program to reverse the order of the hybrid coefficients and all model fields. Since you are under pressure to deliver results, you will not worry about the attributes, and immediately your "direction:up" will be wrong. The file is still a "good" file in the sense that the plotting software can read it and will always display the correct information for a chosen level. Yet, if you want to take advantage of the "direction" attribute, you will be misled.
:* Supersite Project Naming Standards
 
:* PRISM Project Naming Standards
 
  
:I hope that this information will be of use. Clearly this is a major and thankless undertaking since it is so hard to do it "right" for every one's satisfaction. In order to distribute labor we have set up the above wiki pages where interested work group participants can enter links as well as descriptions through the open wiki process. If you would prefer to maintain such an interactive web page as part of your GEMS project we would be more than happy to make our contributions to those pages. [[User:Rhusar|Rhusar]] 19:30, 10 May 2006 (EDT)
+
(2) assume you have a set of files with accumulated deposition fluxes ("amount" according to the new proposal). For a multi-year average of monthly values, you could for example use ncea from the NCO tools. Hardly anyone will afterwards think about a necessary adaptation of the standard name or cell_methods field (and how would you write this? "mean_of_sum"? impossible for any plotting program to "understand" this!).
----
 
::I think your idea with the wiki pages is great! For exchanging and establishing ideas on standards and tools for the intercomparison work in an interactive way. Thanks a lot for pointing us to this.
 
  
::At this point, in the beginning, it would be really important to choose a representative entry point that has a longer lifetime and is not tied too closely to an individual organisation. Otherwise a discussion on standards makes no sense. I also think GEMS is not the right place, since it is a project and by definition will end in some years.
+
OK: my message is:  
 +
(a) try to keep it simple,  
 +
(b) avoid redundancies,
 +
(c) differentiate between tags for automated processing and tags for human information,
 +
(d) provide very clear guidelines as to when a file is CF compliant and which standards are mandatory and which are optional (perhaps one should think about multiple "compliance levels"? level 0 would be the bony basics, level 1 would fulfill a certain set of elements necessary for standard automated processing, level 2 would include all tags amenable for automated processing, and level 3 includes correct tags for human information).
  
::So I have some questions ( also to the colleagues):
+
Don't misunderstand me, please! I am very much interested in seeing this happen (else I wouldn't reply at all). I am only playing the devil's advocate here.
  
::1. Is ESIP the right federation to keep this? (I have not heard of it before.) Isn't that purely American?
+
Best regards,
  
::2. Wouldn't it be better to have for example an IGAC administrated wiki page? I must admit that I find it nice to just start, and maybe we can copy everything to another place once we have found it.
+
Martin
  
::3. Who will be the administrator and can create new pages for example in your initial set-up?
+
[[User:Martin Schultz |Martin Schultz]] 4 July 2006 (EDT)
  
::4. Who can change the general outline of these pages? I think there is always too much meta communication on how to edit and who has edited and when etc on these wiki pages.
 
  
::5. Is there an administrator at ESIP who would react within a day or two if we had small wishes? [[User:MichaelSchulz|MichaelSchulz]] 13:21, 11 May 2006 (EDT)
+
=Christiane Textor's answer =
----
 
:::1. This wiki is part of the Earth Science Information Partners Federation, [http://esipfed.org/ ESIP] wiki. We chose ESIP since it is a more neutral place than our [http://datafedwiki.wustl.edu/index.php/DataFed_Wiki DataFed Project wiki]. My colleague, Stefan Falke, and I are facilitating the [[Air_Quality_Cluster|Air Quality Cluster]] within ESIP and we are using this ESIP wiki extensively. However, since ESIP is an American organization, this is not appropriate as a neutral long-term workspace.
 
  
:::We like the [http://www.mediawiki.org/wiki/MediaWiki Mediawiki software] which is the granddaddy of the wikis, incl. [http://en.wikipedia.org/wiki/Main_Page Wikipedia]. While its syntax is a bit arcane, it stands out as rich, extensible and fast-evolving open-source software.
+
Hi,
  
:::2. IGAC or other international organizations would be much better as neutral and long term hosts for this kind of work. I am sure IGAC would be interested. Last year when I talked to Tim Bates on a similar topic he said that IGAC is interested in this sort of facilitation and he pointed me to Sandro Fuzzi as a further contact.   
+
just a short answer:
:::3. This wiki is fully open for input. Every article page and its associated discussion page can be edited by any participant, not just an administrator. The wiki keeps track and allows recalling all previous versions, accessible through '[http://wiki.esipfed.org/index.php?title=Air_Quality/Chemistry_Naming_Conventions&action=history History]' button. So the issue with the wiki management is which modifications should be restricted to certain users/managers.
 
:::4. To change content, click on "create account or log in" in the upper right corner. To edit article and discussion pages click the edit tab. Practice editing in the <b>[http://wiki.esipfed.org/index.php/EsipSandBox Sandbox]</b>. 
 
:::5. We the community are the administrators. Most of the time we spend "administering" the wiki consists of organizing content, laying out navigation, transferring e-mails to discussion threads. These "management" activities could and should be distributed among appropriate members of the community.[[User:Rhusar|Rhusar]] 18:12, 15 May 2006 (EDT)
 
  
==Wiki Workspace ==
+
1) "interim standard" cannot be called "standard" anymore, we should not create confusion.
I agree that a discussion on the guidelines for chemistry and aerosol names is needed in order to satisfy the needs of as many people as possible. And this is a good start!
 
  
A wiki page for this discussion would be very useful, but we should agree on only one page of the two I am aware of:
+
2) we do not only ask people to do additional work, but also offer a lot of service to them when we analyse their models, this might also make them happy.
1) http://home.badc.rl.ac.uk/lawrence/cf from Bryan Lawrence, or 2) [[Talk:Air_Quality/Chemistry_Naming_Conventions]]; Rudolf Husar has set up this wiki page "Air Quality/Chemistry Naming
 
Conventions".
 
  
As a first step I will now go through the material I gathered, write up a proposal for a list of new names, which can serve as a basis for these discussions, and send it to the wiki page we agreed on. [[User:ChristianeTextor|ChristianeTextor]] 16:05, 15 May 2006 (EDT)
+
3) the non-existence of suitable evaluation tools:
----
+
There are tools existing: I am in contact with people from PCMDI and will probably be able to provide some routines to map standard_names with variable names to be used in the existing analysis tools (like IDL).
:Thanks for bringing us up to date on the CF naming for Air Chemistry topic. A few comments and a suggestion.
 
:* It appears that the wiki is an agreeable tool to conduct much of the communication, cooperation and coordination for this work group
 
:* However, as Michael Schulz properly notes, such a work-space (1) should be at a 'neutral' web-space and (2) have  assurance for longevity.
 
:*We are not in a position to commit to the long-term physical maintenance of the wiki site. Also, we are definitely not equipped to be managers/editors of the wiki contents.
 
 
:Nevertheless, our view is that progress along the naming conventions is a necessary step toward broader interoperability, the same way as netCDF is for binary data encoding and an OGC Web Coverage Service (WCS) is for universal data queries. Within our own small group, we have 4-5 projects and collaborative activities that could benefit from these CF naming extensions. I am sure that most of us could use these conventions well beyond the current HTAP applications.
 
 
 
:*So, we would like to help bootstrapping this effort primarily through our accumulated tools/methods and experience in wiki-aided collaboration.
 
:*One possibility is to start with an experimental wiki site, and then transfer the contents to the neutral long-term site ASAP.
 
:*With the web-based wiki, 'managing' the contents (whatever that means for a wiki) could be transferred immediately to your group.
 
 
:I am sure there are many viable alternative paths to pursue this, so please consider the above simply as a friendly offer for pooling resources and collaboration. [[User:Rhusar|Rhusar]] 16:08, 15 May 2006 (EDT)
 
----
 
:This is a good initiative and I believe the wiki platform might be a suitable way of collecting the relevant information and opinions. I agree with Rudolf that it is of paramount importance to establish such a forum on a "neutral" site and ensure its longevity. Could it be that WMO would be a good place to deal with this? At a recent meeting Len Barrie mentioned that he is always looking for "plums that are ripe to pick", meaning that he would like to see WMO assume this kind of facilitating role, trying to set standards etc. Of course, a potential downside of this may be that an excessive bureaucracy could get involved and make life harder for everyone. Therefore, I would suggest a two-fold strategy: (1) tentatively ask WMO (through Len) whether this kind of initiative could be hosted and maintained there in principle, (2) continue the more informal discussions and soliciting of comments on one of the existing wiki pages for the shorter-term period (i.e. the next year or so).
 
  
:It will also be important to raise the awareness and interest in the modelling community. A good starting point for this could be an IGAC newsletter article quite soon, and an EOS and Eggs article in a few months time. [[User:MartinSchultz|MartinSchultz]] 16:12, 15 May 2006 (EDT)
+
4) I fully agree on the statement "Every error that can be made will be made at some point". But it is independent of the CF conventions; on the contrary, CF helps to minimize errors. Of course other tools, like Automod, would do some basic checks if the data are ok (e.g. for the vertical axis it is enough to check whether the pressure decreases with height).
----
 
::Hey, Martin, I would agree that if we have to pick one neutral organization, WMO is the largest, most active and (through Len Barrie) has a strong internal driving force. I wonder though if there is a way to enlist/engage other neutral orgs, to be neutral-neutral .. multi-neutral :)?? <br> [[image:NeutralOrgs.gif]]<br> ... and doing that without (non-linearly) multiplying the weight of the bureaucracy. Yes, probably impossible in the real world but in cyber-space? Who knows? [[User:Rhusar|Rhusar]] 18:18, 16 May 2006 (EDT) 
 
----
 
:For your information, please [http://datafedwiki.wustl.edu/images/b/b6/060314Schulz_IspraReport.doc find attached a report] from a small workshop held in Ispra in March on cooperation among tracer intercomparisons. (sorry for double posting). Please note:
 
  
:1) that indeed participants agreed to make an IGAC article from that workshop, with the idea to announce standards. And I volunteered to put a small article together. I have already asked IGAC, and the next possibility is some issue in mid/late autumn. If you wish to contribute to the writing please drop me a short notice.
+
5) I agree that the averaging of "amount" variables could be a problem - but it would not help much to include the time period in the unit (e.g. kg/m2/month). A solution would be to include the averaging period in the variable name.
  
:2) that Christiane Textor volunteered to follow-up (if not co-ordinate a first suggestion) the CF naming requirements for aerosol species and reactive gaseous components. We had the feeling that this should happen in this spring to help in some upcoming intercomparison activities, such as HTAP and AeroCom II.
+
6) keep it simple: a very good idea. but levels of compliance do not seem very simple to me...
  
:3) Hosting/coordinating the CF info and wiki discussion at PCMDI (of course) or WMO would be excellent. It would be nice if some working web solution could be settled soon. Meanwhile I think Rudolf's wiki is a very good place to get it going for the aerosols and reactive components. Maybe there is more out there than we are aware of. [[User:MichaelSchulz|MichaelSchulz]] 16:15, 15 May 2006 (EDT)
+
In summary, I think there should be only one standard. The CF names have the goal to be as clear as possible to avoid mistakes, and I feel that our virtual working group is very efficient in fulfilling these requirements: Thanks to you all.
----
 
:PCMDI are in the process of configuring a new set of CF web pages including discussion fora (and possibly task tracking software) ... so in the very near future we expect the management of CF name changes etc to be more than just this mailing list. (Folks will recall that both PCMDI and BADC are contributing effort to try and get CF rolling forward without relying on the contributions of the original authors).
 
  
:Hopefully Kyle is reading the list, and can give us an eta for wider use of the prototype he's got going at the moment.  I think realistically though, Alison (based in the UK) and Kyle (based on the west coast of the US) will need to have a face-to-face chat about how to manage ongoing CF modifications before we get things working really well, and that's planned for mid-June.
+
Best regards,
 +
Christiane
  
:If in the mean-time groups want to use wikis to get ideas sorted, then excellent. [[User:BryanLawrence|BryanLawrence]] 16:09, 15 May 2006 (EDT)
+
[[User:Christiane Textor|Christiane Textor]] 4 July 2006 (EDT)
----
 
::Indeed, the new CF website prototype is up and running at this URL: http://www-pcmdi.llnl.gov/cf It only contains a subset of the content in the current website, but it should provide a good opportunity to give feedback about the direction we're heading. In particular, I'd encourage you to look at the message board system (which would replace the cf-metadata mailing list) and the standard names table:
 
::* http://www-pcmdi.llnl.gov/cf/discussion/message-boards/ || http://www-pcmdi.llnl.gov/cf/documents/cf_standard_names/
 
  
::The website allows members to actively contribute using a Wiki-like content management system. If you're interested in obtaining a username and password to try out the message board and page-editing features, please let me know, and I'll create an account for you.
 
  
::Another feature that you may find to be quite useful is the "live search" capability. If you type some text into the search box in the upper-right corner (try "air pressure", for example) and wait for a few seconds, several search results should pop up beneath the box. This could be a powerful way to search standard names. We don't yet have an ETA for officially switching to the new website, but your feedback, especially in the early stages, is very valuable! [[User:KyleHalliday|KyleHalliday]] 16:19, 15 May 2006 (EDT)
+
=Comments of Jonathan Gregory=
----
 
  
== Place holder ==
+
Dear Christiane and Martin
  
Now CF is quite widely used, it is recognised that more explicit arrangements
+
I agree with Christiane's comments.
are needed for governing its development and giving it status and permanence.
 
The original authors and others have been discussing how to do this. Something
 
should be in place before long.
 
  
As regards the development of standard names for chemistry and aerosol, I would
 
suggest that the work done by Peter van Velthoven for PRISM is a good starting
 
point: http://www.knmi.nl/~velthove/PRISM/CF/guidelines_chemistry_1.3.htm
 
A good deal of thought was invested in that. However, since not many chemical
 
names have so far been added, as you remark, these guidelines are only
 
proposals, not requirements. [[user:JonathanGregory|JonathanGregory]] 17:28, 17 May 2006 (EDT)
 
  
 +
>> 3) the non-existence of suitable evaluation tools:
 +
>> There are tools existing
  
:I would agree that if we have to pick one neutral organization, WMO is the largest, most active and (through Len Barrie) has a strong internal driving force. I wonder though if there is a way to enlist/engage other neutral orgs, to be neutral-neutral .. multi-neutral :)??
+
In particular there is the CF-checker, which verifies conformance to the
:http://wiki.esipfed.org/images/d/da/NeutralOrgs.gif ... and doing that without (non-linearly) multiplying the weight of the bureaucracy. Yes, probably impossible in the real world but in cyber-space? Who knows? [[User:Rhusar|Rhusar]] 17:30, 17 May 2006 (EDT)
+
standard in a "syntactic" sense, as specified by the conformance document
 +
http://www.cgd.ucar.edu/cms/eaton/cf-metadata/conformance-req.html
 +
There is also the CMOR F90 library written by PCMDI to
 +
help people write CF-compliant netCDF more easily.
 +
 
 +
 
 +
>> 5) I agree that the averaging of "amount" variables could be a problem -
 +
 
 +
This problem comes up with other amount variables too, like precipitation.
 +
One solution could be to recognise that if you are averaging them, you are
 +
maybe treating them as a rate, not an amount. Hence the standard name should
 +
be the one of rate, not amount. The unit does not have to be kg m-2 s-1. It
 +
could be kg m-2 day-1, for instance. Although udunits allows "month" we don't
 +
recommend it because its definition is not a calendar month but a particular
 +
(constant) number of seconds - probably not what you want. But the time bounds
 +
of the variable should always indicate the averaging period.
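
The rate-instead-of-amount suggestion can be sketched as follows. This is only an illustration in plain Python, assuming that each accumulated value comes with its time bounds; the function name and the numbers are hypothetical and not part of CF or any tool mentioned here.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400.0


def amount_to_daily_rate(amount_kg_m2, bounds_start, bounds_end):
    """Convert an accumulated amount (kg m-2) into a mean rate (kg m-2 day-1).

    The length of the accumulation period is taken from the time bounds,
    so calendar months of different length are handled correctly.
    """
    days = (bounds_end - bounds_start).total_seconds() / SECONDS_PER_DAY
    if days <= 0:
        raise ValueError("time bounds must span a positive interval")
    return amount_kg_m2 / days


# Hypothetical monthly deposition amounts with their time bounds.
january = amount_to_daily_rate(0.62, datetime(2001, 1, 1), datetime(2001, 2, 1))
february = amount_to_daily_rate(0.56, datetime(2001, 2, 1), datetime(2001, 3, 1))
```

Both months have different amounts but the same rate, which is why averaging the rates is meaningful where averaging the raw amounts is not.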
 +
 
 +
With both rates and amounts, climatological time bounds may help as well,
 +
with which you can record that it is (for instance) the January mean over a
 +
number of years (see CF 7.4). I hope that
 +
tools such as nco may be extended to produce the cell_methods attribute to
 +
describe this, since CF is becoming quite important; in fact we could request
 +
such an extension.
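
As a rough sketch of what such metadata could look like for a multi-year mean of one calendar month: the cell_methods wording ("within years", "over years") follows my reading of CF section 7.4, while the function name and the dictionary keys are purely hypothetical and do not correspond to any existing tool.

```python
from datetime import datetime


def monthly_climatology_metadata(month, first_year, last_year, within="mean"):
    """Describe a multi-year mean of one calendar month.

    within -- statistic applied inside each year's month ("mean" for rates,
              "sum" for accumulated amounts)
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if last_year < first_year:
        raise ValueError("last_year must not precede first_year")
    # Bounds run from the start of the month in the first year
    # to the end of the same month in the last year.
    start = datetime(first_year, month, 1)
    if month == 12:
        end = datetime(last_year + 1, 1, 1)
    else:
        end = datetime(last_year, month + 1, 1)
    return {
        "climatology_bounds": (start, end),
        "cell_methods": f"time: {within} within years time: mean over years",
    }


# January deposition amounts, summed within each January, averaged over 2001-2005.
meta = monthly_climatology_metadata(1, 2001, 2005, within="sum")
```

This would replace an ad hoc label such as "mean_of_sum" with two method entries that a program can parse.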
 +
 
 +
 
 +
>> (1) a certain software tool requires the ordering of levels from top to  
 +
>> bottom, and thus you need a small program to reverse the order of the
 +
>> hybrid coefficients and all model fields. Since you are under pressure
 +
>> to deliver results, you will not worry about the attributes, and
 +
>> immediately your "direction:up" will be wrong. The file is still a
 +
>> "good" file
 +
 
 +
positive is not affected by the ordering of the coordinates. It indicates only
 +
whether larger or smaller values mean up or down.
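
A small illustration of this point, in plain Python with hypothetical pressure levels: the top level is found from the coordinate values and the positive attribute alone, so reversing the storage order does not change the answer.

```python
def index_of_top_level(values, positive):
    """Return the index of the highest level in a vertical coordinate.

    values   -- coordinate values in the order they are stored
    positive -- "up" if larger values are higher, "down" if larger values are lower
    """
    if positive not in ("up", "down"):
        raise ValueError("positive must be 'up' or 'down'")
    pick = max if positive == "up" else min
    return values.index(pick(values))


# Hypothetical pressure levels in hPa; pressure increases downward.
levels = [1000.0, 850.0, 500.0, 200.0]
flipped = list(reversed(levels))

top_original = levels[index_of_top_level(levels, "down")]
top_flipped = flipped[index_of_top_level(flipped, "down")]
```

Both calls pick the 200 hPa level, although it sits at a different index in each array.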
 +
 
 +
 
 +
>> (a) try to keep it simple, (b) avoid redundancies
 +
 
 +
Both of these are principles of CF. See
 +
http://www.cgd.ucar.edu/cms/eaton/cf-metadata/clivar_article.pdf
 +
 
 +
>> (c) differentiate between tags for automated processing and
 +
>> tags for human information
 +
 
 +
We try to provide both at once i.e. metadata which is precise for programs but
 +
also intelligible to humans. This minimises redundancy.
 +
 
 +
 
 +
>> (d) provide very clear guidelines as to when
 +
>> a file is CF compliant and which standards are mandatory and which are
 +
>> optional
 +
 
 +
There is only one kind of compliance defined at present, because most features of
 +
CF (beyond COARDS) are optional, for backward compatibility. But in a
 +
particular application or project you could of course insist on certain
 +
features or choices within the standard.
 +
 
 +
Thanks for your comments. Best wishes
 +
 
 +
Jonathan
 +
 
 +
[[User:Jonathan Gregory|Jonathan Gregory]] 5 July 2006 (EDT)
 +
 
 +
=Mail to Jonathan Gregory on July 10, Christiane Textor, answers from July 11 =
 +
'''some of the items of the original mail appear in other discussions to which they pertain'''
 +
==CF-COMPLIANCE CHECKER==
 +
'''CT:''' After discussion with HTAP people, Martin Schultz in particular, I feel that it would be nice, if the compliance checker could be more informative. It would be helpful to obtain more information on why compliance is not reached. Is this possible?
 +
 
 +
'''JG:''' It depends on software engineering effort. Of course, I agree with you. Do you have any effort available from your project? I would hope that either Kyle or Alison might be able to work on this at some point, but at present both of them are still learning about CF, so I don't foresee any immediate help.
 +
 
 +
==CF HOME PAGE AND DOCUMENTATION==
 +
'''CT:''' It would be good to have a very simple and short summary of what the main objectives of CF are and what CF-compliance means. The documents provided on line are not that easy to understand, and there are many of them, I found 6 relevant links, see
 +
http://wiki.esipfed.org/index.php/Air_Quality/Chemistry_Naming_Resources
 +
Would it be possible to have one simple and short version for CF-beginners?
 +
 
 +
'''JG:''' Again, this would depend on someone else being spun-up enough to write it. Do you think you or anyone else might produce a draft? http://www.cgd.ucar.edu/cms/eaton/cf-metadata/clivar_article.pdf (on the CF home page) is supposed to be something anyone could understand, and it states the objectives of CF near the start. Is this doc too complicated?
 +
 
 +
'''CT:'''  I still have some questions, for example in section 4 of the http://www.cgd.ucar.edu/cms/eaton/cf-metadata/CF-current.html document, no standard_names are given, instead long_names are used. Do I misunderstand something?
 +
 
 +
'''JG:''' No. Some of the examples were written before standard names were defined. Standard names are optional, though.
 +
 
 +
==nco ==
 +
'''CT:''' ... nco is essential for us! ...
 +
 
 +
'''JG:''' I don't expect that nco will recognise the cell_measures, if you mean use it
 +
to do global sums etc. However your own analysis software could use this,
 +
couldn't it? If you think nco should be extended, please write to them to ask
 +
for it. They are aware of CF.
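
A minimal sketch of how one's own analysis software might use cell_measures for a global sum. It assumes the attribute has the form "area: <variable name>"; the variable names and values below are hypothetical, and the file is mimicked by a plain dictionary rather than read with a netCDF library.

```python
def parse_cell_measures(attr):
    """Parse a cell_measures string such as "area: cell_area" into a dict."""
    tokens = attr.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected 'measure: variable' pairs")
    measures = {}
    for key, name in zip(tokens[0::2], tokens[1::2]):
        if not key.endswith(":"):
            raise ValueError(f"malformed measure keyword: {key!r}")
        measures[key[:-1]] = name
    return measures


def area_weighted_sum(values, areas):
    """Sum a per-area quantity over all cells, weighting by cell area."""
    if len(values) != len(areas):
        raise ValueError("values and areas must have the same length")
    return sum(v * a for v, a in zip(values, areas))


# Hypothetical file contents: a flux in kg m-2 s-1 and cell areas in m2.
variables = {
    "dry_dep": {"data": [1.0, 2.0, 3.0], "cell_measures": "area: cell_area"},
    "cell_area": {"data": [10.0, 20.0, 30.0]},
}
area_name = parse_cell_measures(variables["dry_dep"]["cell_measures"])["area"]
total = area_weighted_sum(variables["dry_dep"]["data"], variables[area_name]["data"])
```

The result is in kg s-1, since the per-area flux is multiplied by the area of each cell before summing.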

Latest revision as of 09:43, January 9, 2007

Return to Start page for Atmospheric Chemistry and Aerosol Names PLEASE DO NOT USE THE NAVIGATION BAR ON THE LEFT HAND SIDE!


Go to Agreed Items of Discussion on Air_Quality/Chemistry_Naming_Conventions - General .


Martin Schultz playing the devil's advocate

Hi,

very good! It is becoming more and more clear to me that a lot of systematic thinking already went into the CF standards (and certainly Jonathan deserves a lot of credit for this). Yet, I am still a bit sceptical whether this can really get acceptance by the large community if they need to adapt so thoroughly and get rid of many old habits and custom units. Microsoft also made its fortune by challenging the customer with small changes at a time and sacrificing the perfect system for a better chance to drag the crowd along. Translated to our problem at present, I am still wondering if it wouldn't be better to define some non-udunits "interim standards" just to keep people happy. And if they swallow the first bite and implement CF in their models and tools, one can then in a few years' time work on making the system more stringent. My concern is also related to the non-existence of suitable evaluation tools which will make good use of all the nice attributes and standard names. More and more I get the impression that we are trying to model too much semantic sophistication into the definitions, which makes it practically impossible to project onto a software code as the complexity of this code must be quite large from the start. Yet another concern is my experience with improper netcdf files. Every error that can be made will be made at some point, and if we rely too much on the meaning of attributes, we are certain to get garbage results quite soon. One can of course implement some checking for consistency etc., but I see it as highly improbable that one will be able to catch all errors, and the system is becoming complex enough that it will be difficult to diagnose an error and correct it. Just two simple examples of what can easily go wrong:

(1) a certain software tool requires the ordering of levels from top to bottom, and thus you need a small program to reverse the order of the hybrid coefficients and all model fields. Since you are under pressure to deliver results, you will not worry about the attributes, and immediately your "direction:up" will be wrong. The file is still a "good" file in the sense that the plotting software can read it and will always display the correct information for a chosen level. Yet, if you want to take advantage of the "direction" attribute, you will be misled.

(2) assume you have a set of files with accumulated deposition fluxes ("amount" according to the new proposal). For a multi-year average of monthly values, you could for example use ncea from the NCO tools. Hardly anyone will afterwards think about a necessary adaptation of the standard name or cell_methods field (and how would you write this? "mean_of_sum"? impossible for any plotting program to "understand" this!).

OK: my message is: (a) try to keep it simple, (b) avoid redundancies, (c) differentiate between tags for automated processing and tags for human information, (d) provide very clear guidelines as to when a file is CF compliant and which standards are mandatory and which are optional (perhaps one should think about multiple "compliance levels"? level 0 would be the bony basics, level 1 would fulfill a certain set of elements necessary for standard automated processing, level 2 would include all tags amenable for automated processing, and level 3 includes correct tags for human information).

Don't misunderstand me, please! I am very much interested in seeing this happen (else I wouldn't reply at all). I am only playing the devil's advocate here.

Best regards,

Martin

Martin Schultz 4 July 2006 (EDT)


Christiane Textor's answer

Hi,

just a short answer:

1) "interim standard" cannot be called "standard" anymore, we should not create confusion.

2) we do not only ask people to do additional work, but also offer a lot of service to them when we analyse their models, this might also make them happy.

3) the non-existence of suitable evaluation tools: There are tools existing: I am in contact with people from PCMDI and will probably be able to provide some routines to map standard_names with variable names to be used in the existing analysis tools (like IDL).

4) I fully agree with the statement "Every error that can be made will be made at some point". But this is independent of the CF conventions; on the contrary, CF helps to minimize errors. Of course other tools, like Automod, would do some basic checks of whether the data are ok (e.g. for the vertical axis it is enough to check that the pressure is decreasing with height).

5) I agree that the averaging of "amount" variables could be a problem - but it would not help much to include the time period in the unit (e.g. kg/m2/month). A solution would be to include the averaging period in the variable name.

6) keep it simple: a very good idea. But levels of compliance do not seem very simple to me...
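The basic check mentioned in point 4 can be sketched in a few lines. This is only an illustration, not the Automod implementation: the function name and the sample column are hypothetical, and the levels are assumed to be ordered from the surface upward.

```python
def pressure_decreases_with_height(pressure_bottom_to_top):
    """Return True if every level has a lower pressure than the level below it."""
    pairs = zip(pressure_bottom_to_top, pressure_bottom_to_top[1:])
    return all(upper < lower for lower, upper in pairs)

# Hypothetical column in hPa, ordered from the surface upward (an assumption).
column = [1000.0, 850.0, 500.0, 200.0, 10.0]

ok = pressure_decreases_with_height(column)
# The same column delivered top to bottom fails the check and can be flagged.
reversed_ok = pressure_decreases_with_height(column[::-1])
```

A file that fails this test is not necessarily wrong; it may simply store the levels in the opposite order, which is what the check is meant to reveal.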

In summary, I think there should be only one standard. The CF names have the goal of being as clear as possible to avoid mistakes, and I feel that our virtual working group is very efficient in fulfilling these requirements: Thanks to you all.

Best regards, Christiane

Christiane Textor 4 July 2006 (EDT)


Comments of Jonathan Gregory

Dear Christiane and Martin

I agree with Christiane's comments.


>> 3) the non-existence of suitable evaluation tools: There are tools existing

In particular there is the CF-checker, which verifies conformance to the standard in a "syntactic" sense, as specified by the conformance document at http://www.cgd.ucar.edu/cms/eaton/cf-metadata/conformance-req.html. There is also the CMOR F90 library written by PCMDI to help people write CF-compliant netCDF more easily.


>> 5) I agree that the averaging of "amount" variables could be a problem -

This problem comes up with other amount variables too, like precipitation. One solution could be to recognise that if you are averaging them, you are maybe treating them as a rate, not an amount. Hence the standard name should be the one for the rate, not the amount. The unit does not have to be kg m-2 s-1. It could be kg m-2 day-1, for instance. Although udunits allows "month" we don't recommend it because its definition is not a calendar month but a particular (constant) number of seconds - probably not what you want. But the time bounds of the variable should always indicate the averaging period.
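As a minimal sketch of this idea, an accumulated amount can be turned into a mean rate per day by dividing by the length of the period given by its time bounds. The function name and the sample numbers are hypothetical; the point is only that the real calendar length of each month comes from the bounds, not from a fixed "month" unit.

```python
from datetime import datetime

def amount_to_daily_rate(amount, bounds_start, bounds_end):
    """Convert an accumulated amount (e.g. kg m-2) into a mean rate per day
    (e.g. kg m-2 day-1) using the time bounds of the accumulation period."""
    days = (bounds_end - bounds_start).total_seconds() / 86400.0
    if days <= 0:
        raise ValueError("time bounds must span a positive interval")
    return amount / days

# Hypothetical monthly deposition amounts with their calendar time bounds.
jan = amount_to_daily_rate(3.1, datetime(2001, 1, 1), datetime(2001, 2, 1))
feb = amount_to_daily_rate(2.8, datetime(2001, 2, 1), datetime(2001, 3, 1))
```

Here the January and February amounts differ only because the months differ in length; expressed as rates they are the same, so averaging the rates is meaningful where averaging the amounts would not be.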

With both rates and amounts, climatological time bounds may help as well, with which you can record that it is (for instance) the January mean over a number of years (see CF 7.4). I hope that tools such as nco may be extended to produce the cell_methods attribute to describe this, since CF is becoming quite important; in fact we could request such an extension.
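The metadata for such a climatology could look roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the attributes are held in plain dictionaries instead of being written to a netCDF file, and the choice of "sum within years" followed by "mean over years" for accumulated deposition is one possible reading of CF 7.4, which remains the authority.

```python
def climatology_attributes(within="sum", over="mean"):
    """Build CF-style attributes for a multi-year climatology of monthly values.

    Returns the attributes for the time coordinate and for the data variable.
    """
    time_attrs = {"climatology": "climatology_bounds"}
    data_attrs = {
        "cell_methods": f"time: {within} within years time: {over} over years"
    }
    return time_attrs, data_attrs

def january_bounds(first_year, last_year):
    """Climatological bounds for January: start of the first January to the
    end of the last January in the averaging period."""
    return (f"{first_year}-01-01", f"{last_year}-02-01")

time_attrs, data_attrs = climatology_attributes()
bounds = january_bounds(1990, 2000)
```

With this form, the "mean of sum" case is spelled out as two separate methods on the time axis, which a program can parse and a person can read.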


>> (1) a certain software tool requires the ordering of levels from top to bottom, and thus you need a small program to reverse the order of the hybrid coefficients and all model fields. Since you are under pressure to deliver results, you will not worry about the attributes, and immediately your "direction:up" will be wrong. The file is still a "good" file

The "positive" attribute is not affected by the ordering of the coordinates. It indicates only whether larger or smaller values mean up or down.
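A small sketch may make this concrete. The function names and sample values are hypothetical, and plain lists stand in for the coordinate and the model field. Reversing the storage order changes nothing about which coordinate value is highest, so the same "positive" value still identifies the top level.

```python
def reverse_levels(levels, field):
    """Return copies of the level coordinate and the field in reversed order."""
    return levels[::-1], field[::-1]

def top_level_index(levels, positive):
    """Find the highest level using only the coordinate values and 'positive'."""
    if positive == "down":   # larger values lie lower, e.g. pressure
        return levels.index(min(levels))
    if positive == "up":     # larger values lie higher, e.g. height
        return levels.index(max(levels))
    raise ValueError("positive must be 'up' or 'down'")

pressure = [1000.0, 500.0, 100.0]   # hypothetical levels in hPa, bottom to top
ozone = [30.0, 60.0, 900.0]         # hypothetical values on those levels

top_before = ozone[top_level_index(pressure, "down")]
flipped_pressure, flipped_ozone = reverse_levels(pressure, ozone)
top_after = flipped_ozone[top_level_index(flipped_pressure, "down")]
```

The attribute would only become wrong if the coordinate values themselves were changed, for example by replacing pressure with height without updating the metadata.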


>> (a) try to keep it simple, (b) avoid redundancies

Both of these are principles of CF. See http://www.cgd.ucar.edu/cms/eaton/cf-metadata/clivar_article.pdf

>> (c) differentiate between tags for automated processing and tags for human information

We try to provide both at once i.e. metadata which is precise for programs but also intelligible to humans. This minimises redundancy.


>> (d) provide very clear guidelines as to when a file is CF compliant and which standards are mandatory and which are optional

There is only one kind of compliance defined at present, because most features of CF (beyond COARDS) are optional, for backward compatibility. But in a particular application or project you could of course insist on certain features or choices within the standard.

Thanks for your comments. Best wishes

Jonathan

Jonathan Gregory 5 July 2006 (EDT)

Mail to Jonathan Gregory on July 10, Christiane Textor, answers from July 11

some of the items of the original mail appear in other discussions to which they pertain

CF-COMPLIANCE CHECKER

CT: After discussion with HTAP people, Martin Schultz in particular, I feel that it would be nice, if the compliance checker could be more informative. It would be helpful to obtain more information on why compliance is not reached. Is this possible?

JG: It depends on software engineering effort. Of course, I agree with you. Do you have any effort available from your project? I would hope that either Kyle or Alison might be able to work on this at some point, but at present both of them are still learning about CF, so I don't foresee any immediate help.

CF HOME PAGE AND DOCUMENTATION

CT: It would be good to have a very simple and short summary of what the main objectives of CF are and what CF-compliance means. The documents provided online are not that easy to understand, and there are many of them; I found 6 relevant links, see http://wiki.esipfed.org/index.php/Air_Quality/Chemistry_Naming_Resources. Would it be possible to have one simple and short version for CF-beginners?

JG: Again, this would depend on someone else being spun-up enough to write it. Do you think you or anyone else might produce a draft? http://www.cgd.ucar.edu/cms/eaton/cf-metadata/clivar_article.pdf (on the CF home page) is supposed to be something anyone could understand, and it states the objectives of CF near the start. Is this doc too complicated?

CT: I still have some questions. For example, in section 4 of the http://www.cgd.ucar.edu/cms/eaton/cf-metadata/CF-current.html document, no standard_names are given; instead long_names are used. Do I misunderstand something?

JG: No. Some of the examples were written before standard names were defined. Standard names are optional, though.

nco

CT: ... nco is essential for us! ...

JG: I don't expect that nco will recognise the cell_measures, if you mean using it to do global sums etc. However, your own analysis software could use this, couldn't it? If you think nco should be extended, please write to them to ask for it. They are aware of CF.