9 NetCDF File Structure and Performance

This chapter describes the file structure of a netCDF dataset in enough detail to aid in understanding netCDF performance issues.

NetCDF is a data abstraction for array-oriented data access and a software library that provides a concrete implementation of the interfaces that support that abstraction. The implementation provides a machine-independent format for representing arrays. Although the netCDF file format is hidden below the interfaces, some understanding of the current implementation and associated file structure may help to make clear why some netCDF operations are more expensive than others.

For a detailed description of the netCDF format, see Appendix B "File Format Specification," page 115. Knowledge of the format is not needed for reading and writing netCDF data or understanding most efficiency issues. Programs that use only the documented interfaces and that make no assumptions about the format will continue to work even if the netCDF format is changed in the future, because any such change will be made below the documented interfaces and will support earlier versions of the netCDF file format.

9.1 Parts of a NetCDF File

A netCDF dataset is stored as a single file comprising two parts:

a header, containing all the information about dimensions, attributes, and variables except for the variable data;
a data part, comprising fixed-size data, containing the data for variables that don't have an unlimited dimension; and variable-size data, containing the data for variables that have an unlimited dimension.

Both the header and data parts are represented in a machine-independent form. This form is very similar to XDR (eXternal Data Representation), extended to support efficient storage of arrays of non-byte data.

The header at the beginning of the file contains information about the dimensions, variables, and attributes in the file, including their names, types, and other characteristics. The information about each variable includes the offset to the beginning of the variable's data for fixed-size variables or the relative offset of other variables within a record. The header also contains dimension lengths and information needed to map multidimensional indices for each variable to the appropriate offsets.

This header has no usable extra space; it is only as large as it needs to be for the dimensions, variables, and attributes (including all the attribute values) in the netCDF dataset. This has the advantage that netCDF files are compact, requiring very little overhead to store the ancillary data that makes the datasets self-describing. A disadvantage of this organization is that any operation on a netCDF dataset that requires the header to grow (or, less likely, to shrink), for example adding new dimensions or new variables, requires moving the data by copying it. This expense is incurred when nc_enddef is called, after a previous call to nc_redef. If you create all necessary dimensions, variables, and attributes before writing data, and avoid later additions and renamings of netCDF components that require more space in the header part of the file, you avoid the cost associated with later changing the header.

When the size of the header is changed, data in the file is moved, and the location of data values in the file changes. If another program is reading the netCDF dataset during redefinition, its view of the file will be based on old, probably incorrect indexes. If netCDF datasets are shared across redefinition, some mechanism external to the netCDF library must be provided that prevents access by readers during redefinition, and causes the readers to call nc_sync before any subsequent access.

The fixed-size data part that follows the header contains all the variable data for variables that do not employ an unlimited dimension. The data for each variable is stored contiguously in this part of the file. If there is no unlimited dimension, this is the last part of the netCDF file.

The record-data part that follows the fixed-size data consists of a variable number of fixed-size records, each of which contains data for all the record variables. The record data for each variable is stored contiguously in each record.

The order in which the variable data appears in each data section is the same as the order in which the variables were defined, in increasing numerical order by netCDF variable ID. This knowledge can sometimes be used to enhance data access performance, since the best data access is currently achieved by reading or writing the data in sequential order.

9.2 The Extended XDR Layer

XDR is a standard for describing and encoding data and a library of functions for external data representation, allowing programmers to encode data structures in a machine-independent way. NetCDF employs an extended form of XDR for representing information in the header part and the data parts. This extended XDR is used to write portable data that can be read on any other machine for which the library has been implemented.

The cost of using a canonical external representation for data varies according to the type of data and whether the external form is the same as the machine's native form for that type.

For some data types on some machines, the time required to convert data to and from external form can be significant. The worst case is reading or writing large arrays of floating-point data on a machine that does not use IEEE floating-point as its native representation.

9.3 The I/O Layer

An I/O layer implemented much like the C standard I/O (stdio) library is used by netCDF to read and write portable data to netCDF datasets. Hence an understanding of the standard I/O library provides answers to many questions about multiple processes accessing data concurrently, the use of I/O buffers, and the costs of opening and closing netCDF files. In particular, it is possible to have one process writing a netCDF dataset while other processes read it. Data reads and writes are no more atomic than calls to stdio fread() and fwrite(). An nc_sync call is analogous to the fflush call in the C standard I/O library, writing unwritten buffered data so other processes can read it; nc_sync also brings header changes up-to-date (for example, changes to attribute values). NC_SHARE is analogous to setting a stdio stream to be unbuffered with the _IONBF flag to setvbuf.

As in the stdio library, flushes are also performed when "seeks" occur to a different area of the file. Hence the order of read and write operations can influence I/O performance significantly. Reading data in the same order in which it was written within each record will minimize buffer flushes.

You should not expect netCDF data access to work with multiple writers having the same file open for writing simultaneously.

It is possible to tune an implementation of netCDF for some platforms by replacing the I/O layer with a different platform-specific I/O layer. This may change the similarities between netCDF and standard I/O, and hence characteristics related to data sharing, buffering, and the cost of I/O operations.

The distributed netCDF implementation is meant to be portable. Platform-specific ports that further optimize the implementation for better I/O performance are practical in some cases.

9.4 UNICOS Optimization

As was mentioned in the previous section, it is possible to replace the I/O layer in order to increase I/O efficiency. This has been done for UNICOS, the operating system of Cray computers similar to the Cray Y-MP.

Additionally, it is possible for the user to obtain even greater I/O efficiency through appropriate setting of the NETCDF_FFIOSPEC environment variable. This variable specifies the Flexible File I/O buffers for netCDF I/O when executing under the UNICOS operating system (the variable is ignored on other operating systems). An appropriate specification can greatly increase the efficiency of netCDF I/O--to the extent that it can surpass default FORTRAN binary I/O. Possible specifications include the following:
bufa:336:2 2, asynchronous, I/O buffers of 336 blocks each (i.e., double buffering). This is the default specification and favors sequential I/O.
cache:256:8 8, synchronous, 256-block buffers. This favors larger random accesses.
cachea:256:8:2 8, asynchronous, 256-block buffers with a 2 block read-ahead/write-behind factor. This also favors larger random accesses.
cachea:8:256:0 256, asynchronous, 8-block buffers without read-ahead/write-behind. This favors many smaller pages without read-ahead for more random accesses as typified by slicing netCDF arrays.
cache:8:256,cachea.sds:1024:4:1 This is a two layer cache. The first (synchronous) layer is composed of 256 8-block buffers in memory, the second (asynchronous) layer is composed of 4 1024-block buffers on the SSD. This scheme works well when accesses proceed through the dataset in random waves roughly 2x1024-blocks wide.

All of the options/configurations supported in CRI's FFIO library are available through this mechanism. We recommend that you look at CRI's I/O optimization guide for information on using FFIO to it's fullest. This mechanism is also compatible with CRI's EIE I/O library.

Tuning the NETCDF_FFIOSPEC variable to a program's I/O pattern can dramatically improve performance. Speedups of two orders of magnitude have been seen.

9.1 - Parts of a NetCDF File
9.2 - The Extended XDR Layer
9.3 - The I/O Layer
9.4 - UNICOS Optimization

NetCDF User's Guide for C - 5 JUN 1997

[Next] [Previous] [Top] [Contents] [Index] [netCDF Home Page][Unidata Home Page]

`bufa:336:2`	2, asynchronous, I/O buffers of 336 blocks each (i.e., double buffering). This is the default specification and favors sequential I/O.
`cache:256:8`	8, synchronous, 256-block buffers. This favors larger random accesses.
`cachea:256:8:2`	8, asynchronous, 256-block buffers with a 2 block read-ahead/write-behind factor. This also favors larger random accesses.
`cachea:8:256:0`	256, asynchronous, 8-block buffers without read-ahead/write-behind. This favors many smaller pages without read-ahead for more random accesses as typified by slicing netCDF arrays.
`cache:8:256,cachea.sds:1024:4:1`	This is a two layer cache. The first (synchronous) layer is composed of 256 8-block buffers in memory, the second (asynchronous) layer is composed of 4 1024-block buffers on the SSD. This scheme works well when accesses proceed through the dataset in random waves roughly 2x1024-blocks wide.