Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Zhen Tian, Fei Peng, Michael Folkerts, Jun Tan, Xun Jia, Steve B. Jiang
Department of Radiation Oncology, University of Texas, Southwestern Medical Center, Dallas, TX 75390
arXiv:1503.01721 [physics.med-ph], (5 Mar 2015)





VMAT optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, the limited memory of a single GPU cannot accommodate cases with a large dose-deposition coefficient (DDC) matrix. This paper reports an implementation of our column-generation-based VMAT algorithm on a multi-GPU platform to overcome this memory limitation. The column-generation approach generates apertures sequentially by iteratively solving a pricing problem (PP) and a master problem (MP). The DDC matrix is split into four sub-matrices according to beam angles, and each sub-matrix is stored on one of four GPUs in compressed sparse row format. The beamlet prices are computed across all four GPUs, while the remaining steps of the PP and MP are implemented on a single GPU because of their modest computational loads. A head-and-neck (H&N) patient case was used to validate our method. We compare our multi-GPU implementation with three single-GPU implementation strategies: truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting the computations involving the DDC matrix to the CPU (S3). Two more H&N patient cases and three prostate cases were also used to demonstrate the advantages of our method. Our multi-GPU implementation finishes the optimization within ~1 minute for the H&N patient case. S1 yields inferior plan quality, although its total time is 10 seconds shorter than that of the multi-GPU implementation. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ~4 minutes and ~6 minutes, respectively. High computational efficiency was consistently achieved for the other five cases. The results demonstrate that the multi-GPU implementation can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality.
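The partitioning idea described in the abstract can be illustrated with a small CPU-only sketch. It is not the authors' code: the abstract does not give the price formula or the GPU kernels, so this sketch assumes the beamlet price is the product D^T g, where D is the DDC matrix (voxels as rows, beamlets as columns) and g is a hypothetical gradient of the objective with respect to voxel dose. The names `CSR`, `split_by_angle`, and `beamlet_prices` are invented for illustration, and each "device" is just a Python object standing in for one GPU.

```python
from typing import List, Tuple

Triplet = Tuple[int, int, float]  # (voxel index, beamlet index, coefficient)


class CSR:
    """Minimal compressed sparse row matrix: voxels as rows, beamlets as columns."""

    def __init__(self, n_rows: int, n_cols: int, triplets: List[Triplet]):
        self.n_rows, self.n_cols = n_rows, n_cols
        # Count nonzeros per row, then prefix-sum to get the row pointers.
        self.indptr = [0] * (n_rows + 1)
        for r, _, _ in triplets:
            self.indptr[r + 1] += 1
        for r in range(n_rows):
            self.indptr[r + 1] += self.indptr[r]
        self.indices = [0] * len(triplets)
        self.data = [0.0] * len(triplets)
        cursor = list(self.indptr[:-1])  # next free slot in each row
        for r, c, v in triplets:
            self.indices[cursor[r]] = c
            self.data[cursor[r]] = v
            cursor[r] += 1

    def transpose_matvec(self, g: List[float]) -> List[float]:
        """Return D^T g, i.e. one value per beamlet (column)."""
        out = [0.0] * self.n_cols
        for r in range(self.n_rows):
            for k in range(self.indptr[r], self.indptr[r + 1]):
                out[self.indices[k]] += self.data[k] * g[r]
        return out


def split_by_angle(n_voxels, beamlet_angle, triplets, n_devices=4):
    """Assign contiguous groups of beam angles (and their beamlets) to devices."""
    n_angles = max(beamlet_angle) + 1
    per_dev = -(-n_angles // n_devices)  # ceiling division
    owner = [a // per_dev for a in beamlet_angle]  # device holding each beamlet
    counts = [0] * n_devices
    local = []  # column index of each beamlet inside its sub-matrix
    for d in owner:
        local.append(counts[d])
        counts[d] += 1
    buckets = [[] for _ in range(n_devices)]
    for r, c, v in triplets:
        buckets[owner[c]].append((r, local[c], v))
    subs = [CSR(n_voxels, counts[d], buckets[d]) for d in range(n_devices)]
    return subs, owner, local


def beamlet_prices(subs, owner, local, grad):
    """Assumed price D^T g: each device handles only its own sub-matrix."""
    partial = [s.transpose_matvec(grad) for s in subs]  # one call per "device"
    return [partial[d][j] for d, j in zip(owner, local)]


# Toy problem: 3 voxels, 8 beamlets, one beamlet per beam angle.
angles = list(range(8))
triplets = [(0, 0, 1.0), (1, 0, 2.0), (2, 3, 0.5),
            (0, 5, 4.0), (1, 7, 3.0), (2, 7, 1.0)]
grad = [1.0, -1.0, 2.0]  # hypothetical gradient w.r.t. voxel dose
subs, owner, local = split_by_angle(3, angles, triplets)
prices = beamlet_prices(subs, owner, local, grad)
```

Because the split is by column group, each sub-matrix product needs only the shared gradient vector and produces prices for a disjoint set of beamlets, so the per-device results can be gathered without any reduction across devices. On real hardware each `transpose_matvec` call would correspond to a sparse matrix-vector kernel running on its own GPU.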
