High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures
Dept. of Computer, National University of Defense Technology, Changsha, China
Radioengineering, 2012
@article{su2012high,
title={High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures},
author={Su, H. and Wen, M. and Ren, J. and Wu, N. and Chai, J. and Zhang, C.},
journal={Radioengineering},
volume={21},
number={1},
pages={47},
year={2012}
}
This article presents two highly efficient parallel implementations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory access dependence, and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times is obtained on STORM and more than 50 times on the GPU. The STORM implementation achieves real-time processing of 1080p at 30 fps, and the GPU-based version meets the requirements of 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.
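The staged structure described in the abstract can be illustrated with a small serial sketch. It is not the authors' implementation, and the abstract does not say how their stages are realized internally. The sketch assumes a single 4x4 zigzag scan that gathers the usual CAVLC symbols (TotalCoeff, TrailingOnes, total_zeros), a stand-in coder that uses unsigned Exp-Golomb codes instead of the real H.264 CAVLC tables, and a packing step that places each block's bits at an offset obtained from an exclusive prefix sum of code lengths, which is one common way to let blocks write their output independently. All function names (scan_block, placeholder_code, pack) are hypothetical.

```python
from itertools import accumulate

# Frame-mode zigzag order for a 4x4 block stored in raster order.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def scan_block(block):
    """Scan stage: zigzag reorder, then gather per-block symbols."""
    coeffs = [block[i] for i in ZIGZAG_4X4]
    nonzero = [c for c in coeffs if c != 0]
    # TrailingOnes: up to three +/-1 values at the high-frequency end.
    trailing_ones = 0
    for c in reversed(nonzero):
        if abs(c) != 1 or trailing_ones == 3:
            break
        trailing_ones += 1
    # total_zeros: zeros that precede the last nonzero coefficient.
    last = max((i for i, c in enumerate(coeffs) if c != 0), default=-1)
    return {
        "total_coeff": len(nonzero),
        "trailing_ones": trailing_ones,
        "total_zeros": (last + 1) - len(nonzero),
    }


def placeholder_code(symbols):
    """Coding stage stand-in: Exp-Golomb, NOT the real CAVLC tables."""
    def ue(k):
        b = bin(k + 1)[2:]
        return "0" * (len(b) - 1) + b
    keys = ("total_coeff", "trailing_ones", "total_zeros")
    return "".join(ue(symbols[k]) for k in keys)


def pack(codes):
    """Packing stage: an exclusive prefix sum of code lengths gives each
    block its bit offset, so every block could be written independently."""
    lengths = [len(c) for c in codes]
    offsets = [0] + list(accumulate(lengths))[:-1]
    out = ["0"] * sum(lengths)
    for off, code in zip(offsets, codes):
        out[off:off + len(code)] = code
    return "".join(out), offsets


blocks = [
    [0, 3, -1, 0, 0, -1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0] * 16,
]
symbols = [scan_block(b) for b in blocks]
codes = [placeholder_code(s) for s in symbols]
bitstream, offsets = pack(codes)
```

In this sketch the scan and coding steps depend only on a block's own coefficients, so they map naturally onto data-parallel lanes or threads; only the prefix sum in the packing step needs information from other blocks. Real CAVLC also selects its coefficient-token table from neighbouring blocks' coefficient counts, which the sketch leaves out.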
April 23, 2012 by hgpu