Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit
Sch. of Electr. Eng., Seoul Nat. Univ., Seoul, South Korea
IEEE Workshop on Signal Processing Systems, 2009. SiPS 2009
@conference{ji2009massively,
title={Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit},
author={Ji, H. and Cho, J. and Sung, W.},
booktitle={Signal Processing Systems, 2009. SiPS 2009. IEEE Workshop on},
pages={285--290},
issn={1520-6130},
organization={IEEE}
}
Simulation of low-density parity-check (LDPC) codes frequently takes several days, so the use of general purpose graphics processing units (GPGPUs) is very promising. However, GPGPUs are designed for compute-intensive applications and are not optimized for data caching or control management. In LDPC decoding, the parity check matrix H must be accessed at every node update, and H is often larger than the GPU on-chip memory, especially when the code length is long or the weight is high. In this work, the parity check matrix of cyclic or quasi-cyclic LDPC codes is greatly compressed by exploiting the periodic property of the matrix. Our experiments use Nvidia's Compute Unified Device Architecture (CUDA). With the (1057, 813) and (4161, 3431) projective geometry (PG)-LDPC codes, the proposed method runs more than twice as fast as reference implementations that do not exploit the cyclic property of the parity check matrices.
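The compression idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CUDA implementation, and the paper's actual memory layout is not described here. The sketch assumes a purely cyclic parity check matrix whose row i is the first row shifted cyclically by i positions, so only the column offsets of the first row need to be stored. The function names and the toy offsets {0, 1, 3} modulo 7 (a small projective-plane example) are illustrative assumptions, not values from the paper.

```python
# Sketch: keep only the first row of a cyclic parity check matrix H.
# Assumption: row i of H is the first row shifted cyclically by i columns.
# OFFSETS and N are a toy example chosen for illustration, not from the paper.

N = 7                # code length of the toy example
OFFSETS = [0, 1, 3]  # columns of the ones in row 0


def row_columns(offsets, n, i):
    """Return the columns of the ones in row i, derived from row 0."""
    return [(c + i) % n for c in offsets]


def syndrome(offsets, n, word):
    """Compute H * word^T over GF(2) without ever building H."""
    return [
        sum(word[c] for c in row_columns(offsets, n, i)) % 2
        for i in range(n)
    ]


def dense_matrix(offsets, n):
    """Expand to the full n x n matrix; used only to check the sketch."""
    rows = []
    for i in range(n):
        cols = set(row_columns(offsets, n, i))
        rows.append([1 if j in cols else 0 for j in range(n)])
    return rows


def dense_syndrome(matrix, word):
    """Reference syndrome computed from the uncompressed matrix."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in matrix]
```

In this toy case the compressed form holds 3 integers where the dense matrix holds 49 entries. A decoder would recompute the row positions at each check-node update in the same way, trading a modular addition for a memory access.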
April 14, 2011 by hgpu