ndzip-gpu: Efficient Lossless Compression of Scientific Floating-Point Data on GPUs

Fabian Knorr, Peter Thoman, Thomas Fahringer

University of Innsbruck, Austria

Supercomputing, 2021

Package: ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data

Lossless data compression is a promising software approach for reducing the bandwidth requirements of scientific applications on accelerator clusters without introducing approximation errors. Suitable compressors must compact floating-point data effectively while saturating the system interconnect, so that compression does not introduce additional latency. We present ndzip-gpu, a novel, highly efficient GPU parallelization scheme for the block compressor ndzip, which has recently set a new milestone in CPU floating-point compression speeds. By combining intra-block parallelism with efficient memory access patterns, ndzip-gpu achieves high resource utilization when decorrelating multi-dimensional data via the Integer Lorenzo Transform. We further introduce a novel, efficient warp-cooperative primitive for vertical bit packing, which provides a high-throughput data reduction and expansion step. Using a representative set of scientific data, we compare the performance of ndzip-gpu against five existing GPU compressors. Although the effectiveness of any compressor depends strongly on the characteristics of the dataset, we demonstrate that ndzip-gpu offers the best average compression ratio for the examined data. On Nvidia Turing, Volta and Ampere hardware, it achieves the highest single-precision throughput by a significant margin while maintaining a favorable trade-off between data reduction and throughput in the double-precision case.
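
The two steps named in the abstract, decorrelation with an integer Lorenzo transform and vertical bit packing, can be illustrated with a small sequential sketch. This is not the ndzip or ndzip-gpu implementation: the function names, the 2D-only transform, the 32-word chunk size, and the header-plus-payload layout are all assumptions chosen for illustration, and nothing here is parallelized. The sketch also leaves small negative residuals in their wrapped two's-complement form, which a practical compressor would need to remap so that they also produce leading zero bits.

```python
import struct

MASK = 0xFFFFFFFF  # emulate 32-bit wraparound arithmetic


def float_bits(x):
    """Reinterpret a single-precision float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def lorenzo_forward_2d(grid):
    """Replace each value by its residual against the 2D Lorenzo
    prediction left + up - upleft (out-of-range neighbors count as 0)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            left = grid[i][j - 1] if j > 0 else 0
            up = grid[i - 1][j] if i > 0 else 0
            diag = grid[i - 1][j - 1] if i > 0 and j > 0 else 0
            out[i][j] = (grid[i][j] - left - up + diag) & MASK
    return out


def lorenzo_inverse_2d(res):
    """Undo lorenzo_forward_2d exactly; integer arithmetic makes this lossless."""
    rows, cols = len(res), len(res[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            left = out[i][j - 1] if j > 0 else 0
            up = out[i - 1][j] if i > 0 else 0
            diag = out[i - 1][j - 1] if i > 0 and j > 0 else 0
            out[i][j] = (res[i][j] + left + up - diag) & MASK
    return out


def transpose_bits(words):
    """Transpose a 32x32 bit matrix: bit i of output word k is bit k of
    input word i. Applying the function twice returns the input."""
    assert len(words) == 32
    out = [0] * 32
    for i, w in enumerate(words):
        for k in range(32):
            out[k] |= ((w >> k) & 1) << i
    return out


def pack_chunk(words):
    """Vertical bit packing: transpose, then keep only non-zero bit planes.
    The header has bit k set if plane k is stored in the payload."""
    header, payload = 0, []
    for k, plane in enumerate(transpose_bits(words)):
        if plane:
            header |= 1 << k
            payload.append(plane)
    return header, payload


def unpack_chunk(header, payload):
    """Reinsert the dropped zero planes and transpose back."""
    it = iter(payload)
    planes = [next(it) if (header >> k) & 1 else 0 for k in range(32)]
    return transpose_bits(planes)


if __name__ == "__main__":
    # A smooth 4x8 grid of integers: interior residuals become zero.
    grid = [[1000 + 3 * i + 2 * j for j in range(8)] for i in range(4)]
    residuals = [r for row in lorenzo_forward_2d(grid) for r in row]
    header, payload = pack_chunk(residuals)
    print(f"{len(payload)} of 32 bit planes stored, header={header:#010x}")
    assert unpack_chunk(header, payload) == residuals
```

When the transform leaves mostly small residuals, the high bit positions are zero across the whole chunk, so after transposition entire words vanish and only the header bit records their absence. Floating-point input would first be reinterpreted as integers, for example with `float_bits`, so that every step stays exactly reversible.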

Tags: Compression, Computer science, CUDA, nVidia, nVidia GeForce RTX 2070, nVidia GeForce RTX 3090, Package, SYCL, Tesla V100

August 8, 2021 by hgpu
