Techniques for efficient DCT/IDCT implementation on generic GPU
Dept. of Information Science & Electronic Eng., Zhejiang University, Hangzhou, P.R.China
In Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on (2005), pp. 1126-1129 Vol. 2.
@conference{fang2005techniques,
title={Techniques for efficient DCT/IDCT implementation on generic GPU},
author={Fang, B. and Shen, G. and Li, S. and Chen, H.},
booktitle={Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on},
pages={1126--1129},
year={2005},
organization={IEEE}
}
The emergence of programmable graphics processing units (GPUs) has led to increasing interest in off-loading numerically intensive computations onto graphics hardware. DCT/IDCT is widely adopted in modern image/video compression standards and is usually one of the most computationally expensive parts. We present several techniques for efficient implementation of DCT/IDCT on generic programmable GPUs, using direct matrix multiplication. Our experimental results demonstrate that the speed of IDCT on a GPU using the proposed techniques can well exceed that on a CPU with MMX optimization.
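As a rough illustration of what "direct matrix multiplication" means for the DCT/IDCT, the following CPU-side Python sketch computes a 2-D transform of a block as Y = C X C^T and inverts it as X = C^T Y C, where C is the orthonormal DCT-II basis matrix. It is not the authors' GPU implementation: the 8x8 block size, the orthonormal scaling, and the helper names (`dct_matrix`, `matmul`, `dct2`, `idct2`) are assumptions made for this example.

```python
import math

N = 8  # assumed block size (8x8 is common in image/video codecs)


def dct_matrix(n=N):
    """Build the n x n orthonormal DCT-II basis matrix C."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain dense matrix product a @ b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def dct2(block):
    """Forward 2-D DCT by direct matrix multiplication: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C (valid because C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


if __name__ == "__main__":
    # Hypothetical sample block; any 8x8 block of numbers would do.
    sample = [[float((r * N + col) % 17) for col in range(N)] for r in range(N)]
    restored = idct2(dct2(sample))
    max_err = max(abs(restored[r][col] - sample[r][col])
                  for r in range(N) for col in range(N))
    print("max round-trip error:", max_err)
```

In this formulation each block transform is just two dense matrix products, a regular computation pattern that maps naturally onto parallel hardware; how the paper arranges those products on a GPU is not shown here.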
October 27, 2010 by hgpu