Parallel Compression Checkpointing for Socket-Level Heterogeneous Systems


Yongpeng Liu, Hong Zhu, Yongyan Liu, Feng Wang, Baohua Fan

School of Computer Science, National University of Defense Technology, Changsha 410073, China

13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), pp.468-476, 2011

@article{liu2011parallel,
  title={Parallel Compression Checkpointing for Socket-Level Heterogeneous Systems},
  author={LIU, Y. and ZHU, H. and LIU, Y. and WANG, F. and FAN, B.},
  year={2011}
}


Checkpointing is an effective fault-tolerance technique for improving the reliability of large-scale parallel computing systems. However, it causes a large number of compute nodes to write a huge amount of data to the file system simultaneously. This not only requires substantial storage space for the system state, but also puts heavy pressure on the communication network and I/O subsystem, because a massive number of accesses are concentrated in a short period of time. Data compression can reduce the size of the checkpoint data that must be saved to the file system and sent over the communication network. However, compression introduces a large time overhead, especially in large-scale parallel systems, which is the main technical barrier to its practical use. In this paper, we propose a parallel compression checkpointing technique that reduces this time overhead on socket-level heterogeneous architectures. It integrates several parallel processing techniques, including transferring checkpoint data between the CPU, GPU, and file system in double-buffered pipelines, aggregating file write operations, and running a SIMD parallel compression algorithm on the GPU. The paper also reports an implementation of the technique on the Tianhe-1 supercomputer and its experimental evaluation on that system. The experimental data show that the technique is efficient and practical.
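The pipeline idea in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: `zlib` stands in for the GPU SIMD compressor, a two-slot bounded queue approximates double buffering, and all names and sizes (`checkpoint`, `restore`, `CHUNK_SIZE`, `AGGREGATE_BYTES`) are hypothetical choices made for illustration.

```python
import io
import queue
import struct
import threading
import zlib

CHUNK_SIZE = 64 * 1024         # hypothetical size of one pipeline chunk
NUM_BUFFERS = 2                # two slots: one being filled, one being drained
AGGREGATE_BYTES = 256 * 1024   # hypothetical threshold for one aggregated write


def checkpoint(data: bytes, sink) -> int:
    """Compress `data` chunk by chunk and write it to `sink`.

    Compression (zlib here, as a stand-in for a GPU kernel) overlaps with
    file output, and small compressed blocks are aggregated into fewer,
    larger writes. Returns the number of bytes written.
    """
    slots = queue.Queue(maxsize=NUM_BUFFERS)
    written = 0

    def writer():
        nonlocal written
        pending = bytearray()
        while True:
            block = slots.get()
            if block is None:  # sentinel: no more blocks will arrive
                break
            # Length-prefix each block so it can be located on restore.
            pending += struct.pack("<I", len(block)) + block
            if len(pending) >= AGGREGATE_BYTES:
                sink.write(pending)
                written += len(pending)
                pending = bytearray()
        if pending:  # flush whatever is left
            sink.write(pending)
            written += len(pending)

    thread = threading.Thread(target=writer)
    thread.start()
    for offset in range(0, len(data), CHUNK_SIZE):
        # put() blocks while both slots are full, so the compressor never
        # runs more than two blocks ahead of the writer.
        slots.put(zlib.compress(data[offset:offset + CHUNK_SIZE]))
    slots.put(None)
    thread.join()
    return written


def restore(stream) -> bytes:
    """Read length-prefixed compressed blocks and rebuild the original data."""
    out = bytearray()
    while True:
        header = stream.read(4)
        if not header:
            break
        (length,) = struct.unpack("<I", header)
        out += zlib.decompress(stream.read(length))
    return bytes(out)
```

Calling `checkpoint(state, f)` on a file opened in binary write mode and later `restore(f)` on the same file opened for reading should round-trip the data. In a real heterogeneous system the compression stage would run on the GPU and the host-device transfers would form further pipeline stages, which this sketch does not model.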

Tags: Algorithms, ATI, ATI Radeon HD 4870, Compression, Computer science, Heterogeneous systems, OpenCL

October 22, 2011 by hgpu
