Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation
J.-H. Parq, E. Sevre, S.-M. Lee
QoLT IIDC and School of Earth and Environmental Sciences, Seoul National University, Seoul 151-747, Korea
arXiv:1402.4247 [cs.DC] (18 Feb 2014)
@article{2014arXiv1402.4247P,
  author        = {{Parq}, J.-H. and {Sevre}, E. and {Lee}, S.-M.},
  title         = "{Effects of Easy Hybrid Parallelization with CUDA for Numerical-Atomic-Orbital Density Functional Theory Calculation}",
  journal       = {ArXiv e-prints},
  archivePrefix = "arXiv",
  eprint        = {1402.4247},
  primaryClass  = "cs.DC",
  keywords      = {Computer Science - Distributed, Parallel, and Cluster Computing},
  year          = 2014,
  month         = feb,
  adsurl        = {http://adsabs.harvard.edu/abs/2014arXiv1402.4247P},
  adsnote       = {Provided by the SAO/NASA Astrophysics Data System}
}
We modified an MPI-friendly density functional theory (DFT) source code for hybrid parallelization that includes CUDA. Our objective is to find out how simple conversions within hybrid parallelization on mid-range GPUs affect a DFT code not originally suited to CUDA. We established several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes. The test was performed on a magnetite material system with the OpenMX code, using a hardware system containing two Xeon E5606 CPUs and two Quadro 4000 GPUs. The 3-way hybrid routines obtained a speedup of 7.55, while the 2-way hybrid achieved a speedup of 10.94. GPUs with CUDA complement the efficiency of OpenMP and compensate for the CPUs' excessive competition within MPI.
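As a rough illustration of the kind of three-way MPI + OpenMP + CUDA layout the abstract refers to, the sketch below binds one GPU per MPI rank, shares the rank-local CPU loop among OpenMP threads, and offloads an inner loop to a CUDA kernel before an MPI reduction. It is a minimal, generic pattern, not the authors' actual OpenMX modifications; the kernel, array names, and sizes are hypothetical placeholders.

// Hedged sketch of a generic MPI + OpenMP + CUDA hybrid; not the paper's OpenMX code.
#include <mpi.h>
#include <omp.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Element-wise kernel standing in for a GPU-offloaded inner loop.
__global__ void axpy_kernel(int n, double a, const double* x, double* y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += a * x[i];
}

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Bind each MPI rank to one of the node's GPUs (e.g. two GPUs per node).
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    if (ndev > 0) cudaSetDevice(rank % ndev);

    const int n = 1 << 20;                     // hypothetical work-array length
    std::vector<double> x(n, 1.0), y(n, 0.0);

    // CPU part: OpenMP threads share the rank-local loop.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = 0.5 * x[i];

    // GPU part: offload a heavier inner loop to a CUDA kernel.
    double *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, n * sizeof(double));
    cudaMalloc(&dy, n * sizeof(double));
    cudaMemcpy(dx, x.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), n * sizeof(double), cudaMemcpyHostToDevice);
    axpy_kernel<<<(n + 255) / 256, 256>>>(n, 2.0, dx, dy);
    cudaMemcpy(y.data(), dy, n * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);

    // MPI part: combine rank-local partial results into a global sum.
    double local = 0.0, global = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < n; ++i)
        local += y[i];
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("global sum = %f\n", global);
    MPI_Finalize();
    return 0;
}

One rank per GPU with OpenMP filling the remaining cores is one common way to arrange such a hybrid on a two-CPU, two-GPU node like the test system; the paper's own division of work between MPI, OpenMP, and CUDA is described in the full text.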
February 21, 2014 by hgpu