Systematic Performance Optimization of Cone-Beam Back-Projection on the Kepler Architecture

Timo Zinsser, Benjamin Keck
Siemens AG, Healthcare Sector, Imaging & IT Division, P.O. Box 1266, D-91294 Forchheim, Germany
The 12th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine, 2013


   author={Timo Zinsser and Benjamin Keck},
   editor={Fully3D committee},
   location={Lake Tahoe, CA, USA},
   booktitle={Proceedings of the 12th Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine},
   title={Systematic Performance Optimization of Cone-Beam Back-Projection on the Kepler Architecture},
   bibsource={UnivIS, http://univis.uni-erlangen.de/prg?search=publications&id=91372382&show=elong}





Filtered back-projection algorithms are widely used for the reconstruction of volumetric data from cone-beam projections in interventional C-arm computed tomography. Furthermore, general-purpose GPUs have become a popular tool for accelerating the reconstruction during time-critical clinical procedures. In this work, we focus on the systematic performance optimization of cone-beam back-projection on the latest architecture of CUDA-enabled GPUs. Our optimization approach is based on identifying the major performance bottleneck through the analysis of specifically modified kernels. Our main contribution is a restructuring of the back-projection algorithm that facilitates the simultaneous processing of a large number of projections and, at the same time, improves the hit rate of the texture cache. We use the well-known RabbitCT benchmark to demonstrate the performance of our implementation on a single Kepler-based GeForce GTX 680 GPU. Our implementation performs the back-projection of 496 input projections onto a cubic 512^3 volume in less than one second, which is three times as fast as the best competing implementation. It is also able to reconstruct a cubic 1024^3 volume in about six seconds, which is six times as fast as the best competing implementation known to us.
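For readers unfamiliar with the operation being optimized, the following is a minimal CPU sketch of plain voxel-driven cone-beam back-projection. It is not the authors' Kepler kernel and does not reproduce their restructuring. It assumes each projection comes with a 3x4 projection matrix, uses bilinear interpolation with zeros outside the detector, applies a 1/w^2 distance weight, and, as a simplification, treats voxel indices directly as world coordinates. The names `bilinear` and `backproject` are hypothetical.

```python
import math


def bilinear(img, width, height, px, py):
    """Sample a row-major image at (px, py); pixels outside the image count as 0."""
    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0

    def pixel(ix, iy):
        if 0 <= ix < width and 0 <= iy < height:
            return img[iy * width + ix]
        return 0.0

    top = (1 - fx) * pixel(x0, y0) + fx * pixel(x0 + 1, y0)
    bottom = (1 - fx) * pixel(x0, y0 + 1) + fx * pixel(x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bottom


def backproject(volume, n, projections, matrices, width, height):
    """Accumulate all projections into a flat n*n*n volume, in place.

    Assumption: voxel indices (x, y, z) are used as world coordinates;
    a real setup would first map indices to millimetres.
    """
    for img, a in zip(projections, matrices):
        for z in range(n):
            for y in range(n):
                for x in range(n):
                    # Homogeneous projection of the voxel onto the detector.
                    u = a[0][0] * x + a[0][1] * y + a[0][2] * z + a[0][3]
                    v = a[1][0] * x + a[1][1] * y + a[1][2] * z + a[1][3]
                    w = a[2][0] * x + a[2][1] * y + a[2][2] * z + a[2][3]
                    if w <= 0:
                        continue  # voxel lies behind the source; skip it
                    sample = bilinear(img, width, height, u / w, v / w)
                    # Distance weighting by the squared homogeneous coordinate.
                    volume[(z * n + y) * n + x] += sample / (w * w)
```

Every voxel is updated once per projection, so the two cases in the abstract correspond to 496 × 512^3 and 496 × 1024^3 voxel updates. Each update includes one interpolated image read, which is why the texture cache hit rate matters.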
