Reducing the Disk IO Bandwidth Bottleneck through Fast Floating Point Compression using Accelerators

Ajith Padyana, Devi Sudheer, Pallav Kumar Baruah, Ashok Srinivasan

Department of Mathematics And Computer Science, Sathya Sai Institute of Higher Learning, Muddenahalli Campus, India

International Journal of Advanced Computer Research, Volume 4, Number 1, Issue 14, 2014

@article{padyana2014reducing,

title={Reducing the Disk IO Bandwidth Bottleneck through Fast Floating Point Compression using Accelerators},

author={Padyana, Ajith and Sudheer, Devi and Baruah, Pallav Kumar and Srinivasan, Ashok},

year={2014}

}

Compute-intensive tasks in high-end high performance computing (HPC) systems often generate large amounts of data, especially floating-point data, that need to be transmitted over the network. Although computation speeds are very high, the overall performance of these applications is limited by the data transfer overhead. Moreover, as data sets grow rapidly in size, bandwidth limitations pose a serious bottleneck in several scientific applications. Fast floating-point compression can ameliorate these bandwidth limitations: if data is compressed well, the amount of data transferred is reduced. This reduction in data transfer time comes at the expense of the additional computation required for compression and decompression. The compression and decompression rates must therefore be greater than the network bandwidth; otherwise, it is faster to transmit the uncompressed data directly [1]. Accelerators such as Graphics Processing Units (GPUs) provide substantial computational power. In this paper, we show that the computational power of GPUs and the CellBE processor can be harnessed to provide compression and decompression fast enough for this approach to be effective for data produced by many practical applications. In particular, we use Holt's exponential smoothing algorithm from time series analysis to predict each value, and encode the difference between its predictions and the actual data. This yields a lossless compression scheme. We show that it can be implemented efficiently on GPUs and the CellBE to provide an effective compression scheme for reducing data transfer overheads. The primary contribution of this work lies in demonstrating the potential of floating-point compression to reduce the I/O bandwidth bottleneck on modern hardware for important classes of scientific applications.

Tags: Algorithms, Cell processor, Compression, Computer science, CUDA, nVidia, Tesla M2050

April 9, 2014 by hgpu
