Accelerating tetrahedral interpolation with data-level and Thread-Level Parallel optimization
School of Electrical Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 151-742, Korea
10th International Symposium on Signals, Circuits and Systems (ISSCS), 2011
@inproceedings{ahn2011accelerating,
title={Accelerating tetrahedral interpolation with data-level and Thread-Level Parallel optimization},
author={Ahn, J. and Seong, B. and Sung, W.},
booktitle={Signals, Circuits and Systems (ISSCS), 2011 10th International Symposium on},
pages={1--4},
year={2011},
organization={IEEE}
}
Tetrahedral interpolation for color space conversion is the most time-consuming step in the entire color management process, which makes it difficult to implement a high-end image processing system purely in software. In this study, SIMD (Single Instruction, Multiple Data) and GPGPU (General-Purpose Graphics Processing Unit) based optimizations for tetrahedral interpolation are implemented. To exploit DLP (Data-Level Parallelism) with SIMD extensions, the program is restructured and conditional branches are removed, so that inter-pixel parallelism is used for tetrahedron determination while inter-output-channel parallelism is used for the table lookup and weighted sum. TLP (Thread-Level Parallelism) is exploited on the GPGPU by assigning a different input pixel to each thread. Memory access cycles are minimized by placing the color lookup table in constant memory. We conclude that both DLP and TLP optimizations are essential for recent multi-core CPUs with wider SIMD registers, and that reducing the communication overhead between the host and the device is critical for TLP optimization with GPGPUs.
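For readers unfamiliar with the kernel being optimized, the following is a minimal scalar sketch of tetrahedral interpolation for a single pixel. It is not the authors' SIMD or GPGPU implementation. The names `make_identity_lut` and `tetra_interp`, the nested-list lookup table layout, and the [0, 1] input range are assumptions made for illustration. The sketch shows the three stages named in the abstract: tetrahedron determination (here by sorting the fractional offsets), table lookup of the four tetrahedron vertices, and the weighted sum over output channels.

```python
def make_identity_lut(n):
    """Hypothetical n x n x n table: each grid node stores its own RGB value."""
    step = 1.0 / (n - 1)
    return [[[(i * step, j * step, k * step) for k in range(n)]
             for j in range(n)] for i in range(n)]


def tetra_interp(lut, rgb):
    """Interpolate one pixel (components assumed to lie in [0, 1])."""
    n = len(lut)
    base, frac = [], []
    for c in rgb:
        pos = min(max(c, 0.0), 1.0) * (n - 1)
        i = min(int(pos), n - 2)      # keep i + 1 inside the grid
        base.append(i)
        frac.append(pos - i)

    # Tetrahedron determination: the descending order of the three
    # fractional offsets picks one of the six tetrahedra in the cube.
    order = sorted(range(3), key=lambda a: frac[a], reverse=True)
    f = [frac[a] for a in order]
    weights = [1.0 - f[0], f[0] - f[1], f[1] - f[2], f[2]]

    # Table lookup: walk from the cube's base corner to the opposite
    # corner, stepping along the axis with the largest offset first.
    node = list(base)
    vertices = [tuple(node)]
    for a in order:
        node[a] += 1
        vertices.append(tuple(node))

    # Weighted sum, computed independently for each output channel.
    out = [0.0, 0.0, 0.0]
    for w, (i, j, k) in zip(weights, vertices):
        for ch in range(3):
            out[ch] += w * lut[i][j][k][ch]
    return tuple(out)
```

With an identity table, `tetra_interp(make_identity_lut(5), (0.2, 0.7, 0.4))` should return the input color up to floating-point rounding. The per-pixel function is the unit of work that a thread-per-pixel mapping would assign to each GPU thread, and the inner channel loop corresponds to the inter-output-channel parallelism described above. The sort used here is branch-based, whereas the abstract describes a branch-free restructuring for SIMD.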
August 26, 2011 by hgpu