Performance analysis of accelerated image registration using GPGPU

Peter Bui, Jay B. Brockman
University of Notre Dame
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units GPGPU-2










This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that uses the Compute Unified Device Architecture (CUDA) programming environment to exploit the parallel processing capabilities of NVIDIA's Tesla C870 GPU. We explain the underlying structure of the GPU implementation and compare its performance and accuracy against a fast CPU-based implementation. Our experimental results show that the GPU version achieves speedups of up to 90x with bilinear interpolation and up to 30x with bicubic interpolation while maintaining a high level of accuracy. This compares favorably to recent image registration studies, but it also indicates that our implementation reaches only about 70% of theoretical peak performance. To analyze these results, we use profiling data to identify some of the underlying limitations of CUDA that prevent the implementation from reaching peak performance. Finally, we emphasize the need to manage memory resources carefully in order to fully utilize the GPU and obtain maximum speedup.
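To make the terms in the abstract concrete, the following is a minimal CPU sketch in Python of the operation at the core of 2-D rigid registration: resampling an image under a rotation plus translation using bilinear interpolation. It is not the authors' CUDA implementation. The function names (`bilinear_sample`, `rigid_transform`, `ssd`), the zero value returned outside the image, rotation about the image center, and the sum-of-squared-differences metric are all assumptions made for illustration; the abstract does not state which similarity metric or boundary handling the paper uses.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2-D image (list of rows) at real-valued (x, y).

    Assumption: coordinates outside the image return 0.0.
    """
    height, width = len(image), len(image[0])
    if x < 0 or y < 0 or x > width - 1 or y > height - 1:
        return 0.0
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Blend along x on the two neighboring rows, then blend along y.
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def rigid_transform(image, angle, tx, ty):
    """Rotate by `angle` (radians) about the image center, then shift by (tx, ty)."""
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            # Inverse mapping: find the source position of output pixel (u, v).
            # Each output pixel is independent of the others, which is the kind
            # of per-pixel work a GPU version could assign to one thread.
            dx, dy = u - cx - tx, v - cy - ty
            x = cos_a * dx + sin_a * dy + cx
            y = -sin_a * dx + cos_a * dy + cy
            out[v][u] = bilinear_sample(image, x, y)
    return out


def ssd(a, b):
    """Sum of squared differences between two equally sized images (assumed metric)."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
```

A registration loop would search over `(angle, tx, ty)` for the transform that minimizes `ssd(rigid_transform(moving, angle, tx, ty), fixed)`. Swapping `bilinear_sample` for a bicubic sampler raises the number of source pixels read per output pixel from 4 to 16, which is consistent with the lower speedup the abstract reports for bicubic interpolation, although the abstract itself does not attribute the difference to this.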
