Creating Optimal Code for GPU-Accelerated CT Reconstruction Using Ant Colony Optimization

Eric Papenhausen, Ziyi Zheng, Klaus Mueller
Visual Analytics and Imaging Lab, Center of Visual Computing, Computer Science Department, Stony Brook University, Stony Brook, New York 11794-4400
Medical Physics, 40(3): 031110, 2013










PURPOSE: CT reconstruction algorithms implemented on the GPU are highly sensitive to their implementation details and to the hardware they run on. Fine-tuning an implementation for optimal performance can be time-consuming and may have to be repeated whenever the hardware changes. Some techniques automatically fine-tune GPU code, but they are relatively narrow in what they tune and often rely on heuristics that can be inaccurate. The goal of this paper is to present a framework that automates code optimization with maximum flexibility and produces a final result that is both efficient and readable to the user. METHODS: The authors propose a method that tunes high-level implementation details by using the ant colony optimization algorithm to find the optimal implementation in a relatively short amount of time. The framework takes as input a file that describes a graph in which each path represents a potential implementation. It then uses the ant colony optimization algorithm to find the optimal path through this graph based on execution time and image quality. RESULTS: Two experimental studies are carried out. Using the presented framework, the authors optimize the performance of a GPU-accelerated FDK backprojection implementation and a GPU-accelerated separable footprint backprojection implementation. They demonstrate that the resulting optimal implementation can differ depending on the hardware specifications. They then compare the results produced by the framework with those produced by manual optimization. CONCLUSIONS: The presented framework is a useful tool for increasing programmer productivity and reducing the overhead of leveraging hardware-specific resources. By performing an intelligent search, the framework produces a more efficient image reconstruction implementation in a shorter amount of time.
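The search described under METHODS can be illustrated with a minimal Python sketch of ant colony optimization over a layered graph, where a path picks one option per stage. Everything specific here is an assumption made for illustration: the stage names and options in `STAGES`, the function `synthetic_cost`, and all parameter values are hypothetical. The abstract does not specify the graph file format or the exact objective, and a real cost function would compile and time each candidate kernel and account for image quality.

```python
import random

# Hypothetical design space: each stage lists alternative implementation
# choices. A path through the graph picks exactly one option per stage.
STAGES = [
    ("block_size", [64, 128, 256, 512]),
    ("memory", ["global", "texture", "shared"]),
    ("unroll", [1, 2, 4, 8]),
]


def synthetic_cost(path):
    """Stand-in for measuring a candidate implementation; lower is better."""
    block, memory, unroll = path
    cost = 1.0
    cost += abs(block - 256) / 64.0
    cost += {"global": 2.0, "texture": 0.5, "shared": 1.0}[memory]
    cost += abs(unroll - 4) / 2.0
    return cost


def ant_colony_search(stages, cost_fn, n_ants=10, n_iters=30,
                      evaporation=0.3, seed=0):
    """Return (best_path, best_cost) found by a simple elitist ACO."""
    rng = random.Random(seed)
    # One pheromone value per option (edge) in every stage.
    pheromone = [[1.0] * len(options) for _, options in stages]
    best_idx, best_path, best_cost = None, None, float("inf")

    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant builds a path, choosing options in proportion
            # to the pheromone on them.
            idx = [
                rng.choices(range(len(options)), weights=pheromone[s])[0]
                for s, (_, options) in enumerate(stages)
            ]
            path = tuple(stages[s][1][i] for s, i in enumerate(idx))
            cost = cost_fn(path)
            if cost < best_cost:
                best_idx, best_path, best_cost = idx, path, cost

        # Evaporate all pheromone, then reinforce the best path so far.
        for row in pheromone:
            for i in range(len(row)):
                row[i] *= 1.0 - evaporation
        for s, i in enumerate(best_idx):
            pheromone[s][i] += 1.0 / best_cost

    return best_path, best_cost


best_path, best_cost = ant_colony_search(STAGES, synthetic_cost)
print(best_path, best_cost)
```

Because the cost function is the only hardware-dependent part, rerunning the same search with timings taken on a different GPU could settle on a different path, which is consistent with the hardware dependence reported under RESULTS.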