zfp Arrays: Inline Compression
zfp was designed primarily to serve as a compressed array primitive, and supports both read and write random access to elements of multidimensional arrays in worst-case constant time. zfp partitions d-dimensional arrays into small blocks of 4^d values (4 values along each dimension), and these blocks are compressed or decompressed on demand as array accesses are made. Such compression on demand is often referred to as inline compression.
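The block partitioning can be made concrete with a little index arithmetic. The following is a minimal sketch for the 2D case, where each block holds 4 x 4 = 16 values. The names `BlockLocation`, `blocks_along`, and `locate` are hypothetical and not part of zfp's API, and the row-major ordering of blocks and of values within a block is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of zfp's API): the location of an
// element inside a 2D array that has been partitioned into 4 x 4 blocks.
struct BlockLocation {
    std::size_t block;   // index of the block, counting blocks along x first
    std::size_t offset;  // index of the element within its block, 0..15
};

// Number of blocks needed to cover n elements along one dimension.
// A partially filled block at the boundary still counts as a whole block.
constexpr std::size_t blocks_along(std::size_t n) { return (n + 3) / 4; }

// Map element (i, j) of an nx-by-ny array to its block and in-block offset.
// i is the fastest-varying index, matching the flat index i + nx * j.
// The block and offset orderings are assumptions for illustration only.
inline BlockLocation locate(std::size_t i, std::size_t j,
                            std::size_t nx, std::size_t ny) {
    assert(i < nx && j < ny);
    const std::size_t bx = blocks_along(nx);
    return { (i / 4) + bx * (j / 4), (i % 4) + 4 * (j % 4) };
}
```

Because the mapping uses only a fixed number of divisions and remainders, finding the block that holds any element takes constant time regardless of the array size.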
zfp’s compressed arrays allow the user to specify how many bits of compressed storage to allocate to each array. The actual precision supported by such an array depends on how well the data it stores compresses, and may vary spatially from one block to another. Once declared, the user may interact with the compressed array using flat or multidimensional indexing via the usual array index operators, e.g., a[i + nx * j] and a(i, j) both reference a scalar (of type float or double) with index (i, j) in a 2D array of dimensions nx × ny. Through C++ operator overloading, the user may perform mixed read and write computations like a[i] += a[i-1] and need not even know that the array is stored compressed. As such, zfp arrays serve as a drop-in replacement for standard C/C++ arrays and STL vectors, and can often be substituted into applications with minimal code changes.
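One common way for operator overloading to hide the storage format is a proxy reference: the index operators return a small object that forwards reads and writes to the container instead of exposing a raw `double&`. The sketch below is a toy under stated assumptions. `ToyArray2` and `Ref` are hypothetical names, the storage is an ordinary `std::vector` rather than compressed blocks, and none of this should be read as zfp's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a compressed 2D array. Storage here is a plain
// std::vector; only the indexing interface is meant to be illustrative.
class ToyArray2 {
public:
    // Proxy returned by the index operators. Reads and writes are routed
    // through get() and set(), where a real compressed array could
    // decompress or update the containing block.
    class Ref {
    public:
        Ref(ToyArray2& a, std::size_t k) : a_(a), k_(k) {}
        operator double() const { return a_.get(k_); }                // read
        Ref& operator=(double v) { a_.set(k_, v); return *this; }     // write
        Ref& operator=(const Ref& r) { return *this = static_cast<double>(r); }
        Ref& operator+=(double v) { a_.set(k_, a_.get(k_) + v); return *this; }
    private:
        ToyArray2& a_;
        std::size_t k_;
    };

    ToyArray2(std::size_t nx, std::size_t ny)
        : nx_(nx), ny_(ny), data_(nx * ny, 0.0) {}

    // Flat indexing: a[i + nx * j].
    Ref operator[](std::size_t k) { return Ref(*this, k); }

    // Multidimensional indexing: a(i, j) refers to the same element.
    Ref operator()(std::size_t i, std::size_t j) {
        assert(i < nx_ && j < ny_);
        return Ref(*this, i + nx_ * j);
    }

    std::size_t size() const { return data_.size(); }

private:
    double get(std::size_t k) const { assert(k < data_.size()); return data_[k]; }
    void set(std::size_t k, double v) { assert(k < data_.size()); data_[k] = v; }

    std::size_t nx_, ny_;
    std::vector<double> data_;
};
```

With this class, an expression such as `a[i] += a[i - 1]` compiles unchanged: the right-hand side converts to `double` through a read, and the left-hand side performs a read followed by a write.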
To limit the number of compression and decompression calls, zfp makes use of a small software cache of uncompressed blocks, which is consulted upon each access. If the block containing the requested array element is not present in the cache, it is fetched, decompressed, and stored in the cache. When a block is evicted from the cache, it is compressed back to permanent storage only if it has been modified.
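The write-back policy described above can be sketched with a toy cache. The example assumes a 1D array with blocks of 4 values and a direct-mapped cache, both chosen for brevity. `CachedArray1` and its members are hypothetical names, and "compression" is simulated by a plain copy, with counters recording how often each codec call would have been made.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D array with blocks of 4 values and a direct-mapped,
// write-back cache. Not zfp's implementation; the codec is simulated.
class CachedArray1 {
    using Block = std::array<double, 4>;
    struct Line {
        bool valid = false;     // does this line hold a block?
        bool dirty = false;     // has the cached block been modified?
        std::size_t block = 0;  // which block is cached here
        Block values{};         // uncompressed copy of the block
    };

public:
    CachedArray1(std::size_t n, std::size_t lines)
        : store_((n + 3) / 4, Block{}), cache_(lines) {
        assert(lines > 0);
    }

    double get(std::size_t i) { return line_for(i / 4).values[i % 4]; }

    void set(std::size_t i, double v) {
        Line& l = line_for(i / 4);
        l.values[i % 4] = v;
        l.dirty = true;  // remember that this block must be written back
    }

    std::size_t decompress_calls = 0;
    std::size_t compress_calls = 0;

private:
    // Return the cache line holding block b, fetching it on a miss.
    Line& line_for(std::size_t b) {
        assert(b < store_.size());
        Line& l = cache_[b % cache_.size()];
        if (l.valid && l.block == b) return l;  // hit: no codec call
        if (l.valid && l.dirty) {               // evict: write back only if modified
            store_[l.block] = l.values;         // stands in for compression
            ++compress_calls;
        }
        l.values = store_[b];                   // stands in for decompression
        ++decompress_calls;
        l.valid = true;
        l.dirty = false;
        l.block = b;
        return l;
    }

    std::vector<Block> store_;  // stands in for the compressed storage
    std::vector<Line> cache_;
};
```

In this sketch, repeated accesses to the same block cost one decompression, and evicting a block that was only read costs nothing, which is where a read-mostly workload saves most of its codec calls.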
zfp arrays can be used, for example, to visualize or process very large data sets stored on disk that do not fit in memory, by compressing them on the fly as they are read into memory. Usually only a bit or two per value of compressed storage is needed in visualization applications, allowing, for example, 3D scalar fields as large as 4K^3 doubles (512 GB uncompressed) to be visualized in-core using only 8 GB of RAM.
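The storage figures follow from simple arithmetic, checked below at compile time. The sketch reads "4K" as 4096 samples per dimension and "GB" as 2^30 bytes, and it ignores the cache and any per-array overhead. The names `storage_bytes`, `kGiB`, `kSide`, and `kValues` are introduced here for illustration only.

```cpp
#include <cstdint>

// Bytes needed to store `values` values at `bits_per_value` bits each.
constexpr std::uint64_t storage_bytes(std::uint64_t values,
                                      std::uint64_t bits_per_value) {
    return values * bits_per_value / 8;
}

constexpr std::uint64_t kGiB = std::uint64_t{1} << 30;
constexpr std::uint64_t kSide = 4096;                      // "4K" per dimension
constexpr std::uint64_t kValues = kSide * kSide * kSide;   // 2^36 values

// 64-bit doubles, uncompressed.
static_assert(storage_bytes(kValues, 64) == 512 * kGiB, "uncompressed size");
// One compressed bit per value.
static_assert(storage_bytes(kValues, 1) == 8 * kGiB, "1 bit/value");
// Two compressed bits per value.
static_assert(storage_bytes(kValues, 2) == 16 * kGiB, "2 bits/value");
```

Under these assumptions the 8 GB figure corresponds to one compressed bit per value, and two bits per value would double it to 16 GB.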
Another important application of zfp arrays is in-memory storage of state and large tables in numerical simulations. zfp arrays reduce the memory footprint and may even speed up severely memory-bandwidth-limited applications. The figure below illustrates how zfp arrays are used in a 2D simulation of a shock-bubble interaction. The figure shows snapshots in time (from top to bottom) corresponding to using conventional IEEE 16-bit (half), 32-bit (float), and 64-bit (double) precision representation of the state (e.g. density, pressure, velocity), as well as using zfp arrays with comparable storage (12, 16, 20, 24, and 32 compressed bits/value). The rightmost column shows the full-precision density field, which serves as the gold standard, while the remaining columns show difference images with respect to double precision (with white representing no difference). The values at the bottom represent the root-mean-square error, normalized by the density range, with respect to the double-precision solution at the final time. For the same amount of storage as IEEE, zfp achieves roughly four orders of magnitude higher accuracy. Viewed differently, zfp uses about half the storage of IEEE for the same accuracy. For the simulations below, running with compressed arrays took 2.5 times as long as running with uncompressed floats or doubles, though in read-only applications such as ray tracing we have observed as little as 15% overhead in rendering time using compressed arrays.
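For readers who want to reproduce this kind of error metric on their own data, the sketch below computes a root-mean-square error normalized by the range of the reference field. It is one plausible reading of the metric described above, not code from the study, and `normalized_rmse` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square difference between `approx` and `reference`, divided by
// the range (max - min) of the reference field. Assumes both fields have
// the same nonzero length and that the reference is not constant.
inline double normalized_rmse(const std::vector<double>& reference,
                              const std::vector<double>& approx) {
    assert(!reference.empty() && reference.size() == approx.size());

    double sum_sq = 0.0;
    for (std::size_t k = 0; k < reference.size(); ++k) {
        const double d = approx[k] - reference[k];
        sum_sq += d * d;
    }

    const auto mm = std::minmax_element(reference.begin(), reference.end());
    const double range = *mm.second - *mm.first;
    assert(range > 0.0);

    return std::sqrt(sum_sq / static_cast<double>(reference.size())) / range;
}
```

Dividing by the range makes the result dimensionless, so errors in fields with different units or magnitudes can be compared on the same scale.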
Although the zfp compression CODEC is written in C, its array interface is currently available only through C++. Bindings for other languages that do not support operator overloading are currently being developed.