An Optimized Parallel IDCT on Graphics Processing Units

Biao Wang, Mauricio Alvarez-Mesa, Chi Ching Chi, Ben Juurlink
Embedded Systems Architecture, Technische Universität Berlin, Berlin, Germany
Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar’12), 2012

@article{wang2012optimized,
   title={An Optimized Parallel IDCT on Graphics Processing Units},
   author={Wang, B. and Alvarez-Mesa, M. and Chi, C.C. and Juurlink, B.},
   year={2012}
}


In this paper we present an implementation of the H.264/AVC Inverse Discrete Cosine Transform (IDCT) optimized for Graphics Processing Units (GPUs) using OpenCL. By exploiting the fact that most of the input data of the IDCT for real videos are zero-valued coefficients, a new compacted data representation is created that allows for several optimizations. Experimental evaluations conducted on different GPUs show average speedups from 1.7x to 7.4x compared to an optimized single-threaded SIMD CPU version.
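The abstract does not spell out the compacted representation, so the following is only an illustrative sketch of the general idea, not the authors' OpenCL implementation. It assumes flat 16-entry blocks of already dequantized coefficients, stores each block as (index, value) pairs for its nonzero entries, and uses the pair list to skip all-zero blocks and shortcut DC-only blocks before falling back to the standard H.264 4x4 inverse integer transform. The function names (`compact_block`, `idct4x4`, `idct4x4_compacted`) are hypothetical.

```python
def compact_block(coeffs):
    """Return (index, value) pairs for the nonzero entries of a flat 4x4 block."""
    return [(i, c) for i, c in enumerate(coeffs) if c != 0]


def _butterfly(w0, w1, w2, w3):
    """One 1-D pass of the H.264 4x4 inverse integer transform."""
    e0 = w0 + w2
    e1 = w0 - w2
    e2 = (w1 >> 1) - w3
    e3 = w1 + (w3 >> 1)
    return e0 + e3, e1 + e2, e1 - e2, e0 - e3


def idct4x4(coeffs):
    """Dense reference: row pass, column pass, then rounding shift by 6."""
    rows = [_butterfly(*coeffs[4 * r:4 * r + 4]) for r in range(4)]
    cols = [_butterfly(rows[0][c], rows[1][c], rows[2][c], rows[3][c])
            for c in range(4)]
    # cols[c][r] holds the value at row r, column c; emit in row-major order.
    return [(cols[c][r] + 32) >> 6 for r in range(4) for c in range(4)]


def idct4x4_compacted(pairs):
    """Inverse transform driven by the compacted (index, value) pairs."""
    if not pairs:
        # All-zero block: the residual is zero, no transform needed.
        return [0] * 16
    if len(pairs) == 1 and pairs[0][0] == 0:
        # DC-only block: every output sample equals the rounded DC value.
        return [(pairs[0][1] + 32) >> 6] * 16
    # General case (assumption of this sketch): expand and run the dense path.
    dense = [0] * 16
    for i, v in pairs:
        dense[i] = v
    return idct4x4(dense)
```

A caller would compact each block once and pass the pair list on, for example `idct4x4_compacted(compact_block(block))`. On a GPU the same early-out tests would have to be arranged so that work-items in a group take the same path, which this sequential sketch does not attempt to model.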
