Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation

Jae-Hyeon Parq, Erik Sevre, Sang-Mook Lee
QoLT IIDC and School of Earth and Environmental Sciences, Seoul National University, Seoul 151-747, Korea
arXiv:1402.4247 [cs.DC], (18 Feb 2014)

@article{2014arXiv1402.4247P,
   author={{Parq}, J.-H. and {Sevre}, E. and {Lee}, S.-M.},
   title={{Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation}},
   journal={ArXiv e-prints},
   archivePrefix={arXiv},
   eprint={1402.4247},
   primaryClass={cs.DC},
   keywords={Computer Science – Distributed, Parallel, and Cluster Computing},
   year={2014},
   month={feb},
   adsurl={http://adsabs.harvard.edu/abs/2014arXiv1402.4247P},
   adsnote={Provided by the SAO/NASA Astrophysics Data System}
}

We modified an MPI-friendly density functional theory (DFT) source code to use hybrid parallelization that includes CUDA. Our objective is to find out how simple conversions to hybrid parallelization with mid-range GPUs affect a DFT code not originally suited to CUDA. We established several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes. The test was performed on a magnetite material system with the OpenMX code, using a hardware system containing 2 Xeon E5606 CPUs and 2 Quadro 4000 GPUs. The 3-way hybrid routines obtained a speedup of 7.55, while the 2-way hybrid routines obtained a speedup of 10.94. GPUs with CUDA complement the efficiency of OpenMP and compensate for CPUs’ excessive competition within MPI.