Massively LDPC Decoding on Multicore Architectures
Dept. of Electr. & Comput. Eng., University of Coimbra, Coimbra, Portugal
IEEE Transactions on Parallel and Distributed Systems, 2011
@article{falcao2010massively,
title={Massively LDPC Decoding on Multicore Architectures},
author={Falcao, G. and Sousa, L. and Silva, V.},
journal={IEEE Transactions on Parallel and Distributed Systems},
pages={309--322},
year={2010},
publisher={Published by the IEEE Computer Society}
}
Unlike the usual VLSI approaches required for computationally intensive Low-Density Parity-Check (LDPC) code decoders, this paper presents flexible software-based LDPC decoders. Algorithms and data structures suitable for parallel computing are proposed to perform LDPC decoding on multicore architectures. To evaluate the efficiency of the proposed parallel algorithms, LDPC decoders were developed on recent multicores, such as off-the-shelf general-purpose x86 processors, Graphics Processing Units (GPUs), and the CELL Broadband Engine (CELL/B.E.). Challenging restrictions, such as memory access conflicts, latency, coalescence, and the unknown behavior of thread and block schedulers, were analyzed and resolved. Experimental results for different code lengths show throughputs on the order of 1 to 2 Mbps on the general-purpose multicores, and ranging from 40 Mbps on the GPU to nearly 70 Mbps on the CELL/B.E. The results show that the CELL/B.E. performs better for short to medium length codes, while the GPU achieves superior throughputs with larger codes. In some cases, these platforms achieve throughputs that come close to those obtained with VLSI decoders. The results also suggest that throughput will increase as the number of cores rises.
June 10, 2011 by hgpu