Creating NetCDF CF Files

From Earth Science Information Partners (ESIP)

UNDER CONSTRUCTION



NetCDF-CF Convention

The reason to use the CF convention: it enables plug-and-play connectivity.

Well-made NetCDF files are human-readable. After all, what could a dimension named longitude mean besides longitude? If you get data in NetCDF format, it is usually fairly easy to see what is really there.

It is also easy to write a generic browser that can display every variable for you.

But since much of the data in NetCDF files has geographical meaning, a graphical viewer should be able to draw the data on a map on its own. This involves, at minimum:

  • Finding the three geographical dimensions
  • Finding the time dimension, if any
  • Understanding the geographical projection

For a generic NetCDF file, this requires human intelligence. After all, n-dimensional data variables, dimension variables and other metadata variables look precisely the same to program code. There is a legion of ways to encode projection information, and decoding it reliably is very difficult.

Conventions come to the rescue. For example, if a variable has the attribute axis='X', there is only one interpretation for the values of this variable: it must have just one dimension, and the values are points on the X axis along that dimension. No more guesswork, and no more wrong guesses, for programmers.
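As a rough illustration of how a program can rely on the axis attribute, the sketch below models variable metadata as plain Python dictionaries and picks out the coordinate variables without guessing from their names. The dictionary layout and the function name find_axes are assumptions made for this example; a real reader would obtain the same information from a NetCDF library.

```python
# Hypothetical in-memory model of NetCDF variable metadata:
# variable name -> (tuple of dimension names, attribute dict).
variables = {
    "time": (("time",), {"axis": "T", "units": "days since 1979-01-01"}),
    "lat": (("lat",), {"axis": "Y", "units": "degrees_north"}),
    "lon": (("lon",), {"axis": "X", "units": "degrees_east"}),
    "AI": (("time", "lat", "lon"), {"long_name": "Aerosol Index"}),
}


def find_axes(variables):
    """Map each declared axis letter (X, Y, Z, T) to its variable name."""
    axes = {}
    for name, (dims, attrs) in variables.items():
        axis = attrs.get("axis")
        if axis is None:
            continue  # no axis attribute: not treated as a coordinate variable
        if len(dims) != 1:
            raise ValueError(f"{name}: an axis variable must have one dimension")
        axes[axis] = name
    return axes


axes = find_axes(variables)
print(axes)
```

Variables without an axis attribute, such as AI here, are simply skipped, so the remaining ones can be treated as data.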


The best documentation is on the CF Metadata page.

Example

In short, a simple ncdump output for a NetCDF-CF file may look like this:

netcdf TOMS_AI_58 {
dimensions:
	time = 1 ;
	lat = 3 ;
	lon = 4 ;
variables:
	double time(time) ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:units = "days since 1979-01-01" ;
		time:axis = "T" ;
	double lat(lat) ;
		lat:standard_name = "latitude" ;
		lat:long_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
	double lon(lon) ;
		lon:standard_name = "longitude" ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
	byte AI(time, lat, lon) ;
		AI:long_name = "Aerosol Index" ;
		AI:units = "fraction" ;
		AI:_FillValue = -1b ;
		AI:missing_value = -1b ;

// global attributes:
		:title = "NASA TOMS Project" ;
		:comment = "NASA Total Ozone Mapping Spectrometer Project" ;
		:Conventions = "CF-1.0" ;
data:

 time = 9952 ;

 lat = 32.5, 33.5, 34.5 ;

 lon = -89.375, -88.125, -86.875, -85.625 ;

 AI =
  0, 2, 1, 2,
  _, 2, 3, 2,
  1, 4, 4, 2 ;
}

Let's go over it section by section:

Dimensions

dimensions:
	time = 1 ;
	lat = 3 ;
	lon = 4 ;

These names can be anything. The reason is that you may sometimes want to store two grids with different dimensions in the same file. In that case you could name the dimensions lat1, lat2, and so on.

Time Dimension Variable

	double time(time) ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:units = "days since 1979-01-01" ;
		time:axis = "T" ;

It is the attribute axis = "T" that marks this variable as the time dimension.
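The units attribute tells a reader how to turn the stored number into a date. A minimal sketch, assuming the standard Gregorian calendar and handling only the "days since YYYY-MM-DD" form used in this example (the helper name decode_days_since is made up for illustration):

```python
from datetime import datetime, timedelta


def decode_days_since(units, value):
    """Decode a 'days since YYYY-MM-DD' time value into a datetime."""
    unit, _, reference = units.partition(" since ")
    if unit != "days":
        raise ValueError(f"this sketch only handles days, got {unit!r}")
    origin = datetime.strptime(reference.strip(), "%Y-%m-%d")
    return origin + timedelta(days=value)


# time = 9952 in the example file
decoded = decode_days_since("days since 1979-01-01", 9952)
print(decoded.date())  # 2006-04-01
```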

Latitude and Longitude Dimension Variables

	double lat(lat) ;
		lat:standard_name = "latitude" ;
		lat:long_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
	double lon(lon) ;
		lon:standard_name = "longitude" ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;

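Again, it is the attributes axis = "Y" and axis = "X" that mark the geographical dimension variables. The standard names latitude and longitude identify the grid as a plain, unprojected latitude-longitude grid, so no further projection information is given here.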

Data Variable

	byte AI(time, lat, lon) ;
		AI:long_name = "Aerosol Index" ;
		AI:units = "fraction" ;
		AI:_FillValue = -1b ;
		AI:missing_value = -1b ;

The dimension variables are marked with axis attributes; AI has no axis attribute, so it is a data variable. The value units = "fraction" is not a standard unit, and therefore the compliance checker reports it as an error.
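The _FillValue attribute marks cells that hold no measurement; in the ncdump output above, such a cell is printed as "_". A small sketch of how a reader might mask those cells, using the AI values from the example (the function name mask_fill and the choice of None as the masked value are assumptions for illustration):

```python
FILL_VALUE = -1  # AI:_FillValue = -1b in the example file

# The single time step of AI as stored: 3 latitudes x 4 longitudes.
raw = [
    [0, 2, 1, 2],
    [-1, 2, 3, 2],
    [1, 4, 4, 2],
]


def mask_fill(rows, fill_value):
    """Replace the fill value with None so it cannot be mistaken for data."""
    return [[None if v == fill_value else v for v in row] for row in rows]


masked = mask_fill(raw, FILL_VALUE)
print(masked[1])  # [None, 2, 3, 2]
```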

Global Attributes

		:title = "NASA TOMS Project" ;
		:comment = "NASA Total Ozone Mapping Spectrometer Project" ;
		:Conventions = "CF-1.0" ;

Only the Conventions attribute, with the value CF-1.x, is required.
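A program can use this attribute to decide whether to apply CF rules at all. The sketch below is one possible check, assuming the global attributes are available as a dictionary; the function name declares_cf is hypothetical, and this is not a substitute for a real compliance check.

```python
import re


def declares_cf(global_attrs):
    """Return True if the Conventions attribute names a CF-1.x version."""
    conventions = global_attrs.get("Conventions", "")
    # Allow several conventions in one attribute, separated by commas or spaces.
    tokens = [t for t in re.split(r"[,\s]+", conventions) if t]
    return any(re.fullmatch(r"CF-1\.\d+", t) for t in tokens)


attrs = {"title": "NASA TOMS Project", "Conventions": "CF-1.0"}
print(declares_cf(attrs))  # True
```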

Verifying NetCDF-CF Files

Since the CF-1.0 conventions contain a lot of definitions, verifying them by machine is necessary. There is a fairly complete compliance checker online. Its maintainers have some NetCDF documentation and CF convention documentation online too. The latest compliance checker is here; it lets you upload a NetCDF file and runs a wide range of checks.

TODO: this is just a standard form submitted with HTTP POST, and therefore it should be easy to use it as a service from Python. The owsadmin tool should clone a subset of a NetCDF cube and submit it.

Creating NetCDF-CF files

There are a few ways to create a NetCDF file. In general, it is much easier to create the empty file with a descriptive method and then use a programming language to fill in the data.

Use CDL and ncgen

The CDL text you see above is fairly readable, and ncgen can turn it into a NetCDF file.
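As a sketch of how this step could be scripted, the snippet below builds the ncgen command line and runs it only if the tool and the input file are present. The file names TOMS_AI_58.cdl and TOMS_AI_58.nc are assumptions for illustration; the CDL file would contain the text shown in the example above.

```python
import os
import shutil
import subprocess


def ncgen_command(cdl_path, nc_path):
    """Build the ncgen command line that compiles a CDL file into NetCDF."""
    return ["ncgen", "-o", nc_path, cdl_path]


cmd = ncgen_command("TOMS_AI_58.cdl", "TOMS_AI_58.nc")
print(" ".join(cmd))

# Run the tool only when it is installed and the CDL file exists.
if shutil.which("ncgen") and os.path.exists("TOMS_AI_58.cdl"):
    subprocess.run(cmd, check=True)
```

Running ncdump on the resulting file should print CDL equivalent to the input, which is a quick way to confirm the round trip.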

Use NCML

NCML, the NetCDF Markup Language, is an XML language. The designers provide some tools, and the Datafed OWS system supports creating and verifying files with it.

After installing on Windows or Linux, you have the datafed.nc3 and datafed.cf1 modules.

Creating and Filling NetCDF Files using NCML and Python

Here's a test NCML file: CMAQ_Baron_20.ncml

from datafed import nc3, cf1

cf1.create_ncml22("CMAQ_Baron_20.nc", "CMAQ_Baron_20.ncml")

Creating NetCDF-CF files using NetCDF Markup Language

NCML utilities here.