Non-rigid multi-modal registration on the GPU
Siemens Corporate Research
Medical Imaging 2007: Image Processing, Vol. 6512, No. 1. (2007)
@conference{vetter2007non,
title={Non-rigid multi-modal registration on the GPU},
author={Vetter, C. and Guetter, C. and Xu, C. and Westermann, R.},
booktitle={Proceedings of SPIE},
volume={6512},
pages={651228},
year={2007}
}
Non-rigid multi-modal registration of images/volumes is becoming increasingly necessary in many medical settings. While efficient registration algorithms have been published, the speed of the solutions is a problem in clinical applications. Harnessing the computational power of the graphics processing unit (GPU) for general purpose computations has become increasingly popular in order to speed up algorithms further, but the algorithms have to be adapted to the data-parallel, streaming model of the GPU. This paper describes the implementation of a non-rigid, multi-modal registration using mutual information and the Kullback-Leibler divergence between observed and learned joint intensity distributions. The entire registration process is implemented on the GPU, including a GPU-friendly computation of two-dimensional histograms using vertex texture fetches as well as an implementation of recursive Gaussian filtering on the GPU. Since the computation is performed on the GPU, interactive visualization of the registration process can be done without bus transfer between main memory and video memory. This allows the user to observe the registration process and to evaluate the result more easily. Two hybrid approaches distributing the computation between the GPU and CPU are discussed. The first approach uses the CPU for lower resolutions and the GPU for higher resolutions; the second approach uses the GPU to compute a first approximation to the registration, which is then used as the starting point for registration on the CPU in double precision. The results of the CPU implementation are compared with those of the different GPU approaches in terms of speed and image quality. The GPU performs up to 5 times faster per iteration than the CPU implementation.
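The two similarity terms named in the abstract, mutual information and the Kullback-Leibler divergence between an observed and a learned joint intensity distribution, are both computed from a two-dimensional joint histogram. The following is a minimal CPU sketch in plain Python, not the paper's GPU implementation: the function names, the flat intensity lists, the bin count, and the `eps` guard are all assumptions made for illustration, and the vertex-texture-fetch histogram and recursive Gaussian filtering are not shown.

```python
import math
from collections import Counter


def joint_histogram(fixed, moving, bins, max_value=255):
    """Normalized 2D joint histogram of two equal-length intensity sequences.

    Returns a sparse dict mapping (fixed_bin, moving_bin) -> probability.
    """
    if len(fixed) != len(moving):
        raise ValueError("images must have the same number of samples")
    scale = bins / (max_value + 1)  # maps 0..max_value onto 0..bins-1
    counts = Counter(
        (int(f * scale), int(m * scale)) for f, m in zip(fixed, moving)
    )
    n = len(fixed)
    return {cell: c / n for cell, c in counts.items()}


def mutual_information(joint):
    """Mutual information in nats: sum p(i,j) * log(p(i,j) / (p(i) * p(j)))."""
    p_fixed, p_moving = Counter(), Counter()
    for (i, j), p in joint.items():
        p_fixed[i] += p   # marginal over the moving image
        p_moving[j] += p  # marginal over the fixed image
    return sum(
        p * math.log(p / (p_fixed[i] * p_moving[j]))
        for (i, j), p in joint.items()
    )


def kl_divergence(observed, learned, eps=1e-12):
    """KL(observed || learned); eps (an assumption) guards empty learned cells."""
    return sum(
        p * math.log(p / max(learned.get(cell, 0.0), eps))
        for cell, p in observed.items()
    )
```

In a registration loop of this kind, the observed histogram would be recomputed from the fixed image and the currently warped moving image at each iteration, while the learned histogram would come from previously aligned training data. How the paper weights the two terms is not stated in the abstract.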
December 4, 2010 by hgpu