OpenMP, OpenMP/MPI, and CUDA/MPI C programs for solving the time-dependent dipolar Gross-Pitaevskii equation

Vladimir Loncar, Luis E. Young-S., Srdjan Skrbic, Paulsamy Muruganandam, Sadhan K. Adhikari, Antun Balaz
Scientific Computing Laboratory, Center for the Study of Complex Systems, Institute of Physics Belgrade, University of Belgrade, Pregrevica 118, 11080 Belgrade, Serbia
arXiv:1610.05329 [cond-mat.quant-gas] (17 Oct 2016)

@article{loncar2016openmp,
   title={OpenMP, OpenMP/MPI, and CUDA/MPI C programs for solving the time-dependent dipolar Gross-Pitaevskii equation},
   author={Loncar, Vladimir and Young-S., Luis E. and Skrbic, Srdjan and Muruganandam, Paulsamy and Adhikari, Sadhan K. and Balaz, Antun},
   journal={Computer Physics Communications},
   year={2016},
   month={oct},
   eprint={1610.05329},
   archivePrefix={arXiv},
   primaryClass={cond-mat.quant-gas},
   doi={10.1016/j.cpc.2016.07.029}
}


We present new versions of the previously published C and CUDA programs for solving the dipolar Gross-Pitaevskii equation in one, two, and three spatial dimensions, which calculate stationary and non-stationary solutions by propagation in imaginary or real time. The presented programs are improved and parallelized versions of the previous programs, divided into three packages according to the type of parallelization. The first package contains improved and threaded versions of the sequential C programs, parallelized using OpenMP. The second package additionally parallelizes the three-dimensional variants of the OpenMP programs using MPI, allowing them to run on distributed-memory systems. Finally, the previous three-dimensional CUDA-parallelized programs are further parallelized using MPI, in the same manner as the OpenMP programs. We also present speedup test results obtained with the new versions of the programs, compared with the previous sequential C and parallel CUDA programs. The improvements to the sequential version yield a speedup of 1.1 to 1.9, depending on the program. OpenMP parallelization yields a further speedup of 2 to 12 on a 16-core workstation, while the OpenMP/MPI version demonstrates a speedup of 11.5 to 16.5 on a computer cluster with 32 nodes. The CUDA/MPI version shows a speedup of 9 to 10 on a computer cluster with 32 nodes.
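For reference, the equation these packages solve is the time-dependent dipolar Gross-Pitaevskii equation. In standard notation (the symbols below follow common convention and are not taken verbatim from the paper; g is the contact-interaction coupling and U_dd the dipole-dipole interaction):

i\hbar\,\frac{\partial \psi(\mathbf{r},t)}{\partial t} = \left[ -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) + g\,|\psi(\mathbf{r},t)|^2 + \int U_{\rm dd}(\mathbf{r}-\mathbf{r}')\,|\psi(\mathbf{r}',t)|^2\,d\mathbf{r}' \right] \psi(\mathbf{r},t),
\qquad U_{\rm dd}(\mathbf{R}) = \frac{\mu_0\mu^2}{4\pi}\,\frac{1-3\cos^2\theta}{|\mathbf{R}|^3},

where \theta is the angle between \mathbf{R} and the polarization axis. Propagation in imaginary time (t \to -i\tau, with renormalization after each step) relaxes an initial state toward the stationary ground state, while propagation in real time yields the dynamics.

The sketch below illustrates the OpenMP loop-threading pattern the abstract describes, on the simplest possible example: one imaginary-time step of the trap-plus-contact part of a 1D equation, followed by renormalization. This is a minimal illustration under assumed parameters, not code from the published package; the grid size, time step, and coupling are arbitrary, and the Crank-Nicolson kinetic-energy step and the dipolar term (which the actual programs evaluate via FFT) are omitted.

/* Minimal OpenMP sketch (illustrative, not the published code):
 * imaginary-time relaxation of a 1D harmonically trapped condensate,
 * keeping only the trap + contact-interaction step. All parameters
 * (NX, DX, DT, G) are arbitrary illustrative values. */
#include <math.h>
#include <stdio.h>

#define NX 10000     /* number of grid points        */
#define DX 0.01      /* spatial step                 */
#define DT 0.0001    /* imaginary-time step          */
#define G  10.0      /* contact-interaction strength */

static double psi[NX];

int main(void) {
    /* Gaussian initial state */
    #pragma omp parallel for
    for (int i = 0; i < NX; i++) {
        double x = (i - NX / 2) * DX;
        psi[i] = exp(-0.5 * x * x);
    }

    for (int step = 0; step < 1000; step++) {
        double norm = 0.0;
        /* Potential + nonlinear step in imaginary time; each grid point
         * is independent, so the loop is threaded with OpenMP. */
        #pragma omp parallel for reduction(+:norm)
        for (int i = 0; i < NX; i++) {
            double x = (i - NX / 2) * DX;
            double v = 0.5 * x * x + G * psi[i] * psi[i];
            psi[i] *= exp(-DT * v);
            norm += psi[i] * psi[i] * DX;
        }
        /* Imaginary-time evolution does not conserve the norm,
         * so renormalize after every step. */
        double s = 1.0 / sqrt(norm);
        #pragma omp parallel for
        for (int i = 0; i < NX; i++)
            psi[i] *= s;
    }
    printf("central density |psi(0)|^2 = %g\n", psi[NX / 2] * psi[NX / 2]);
    return 0;
}

Compile with, e.g., gcc -fopenmp -O2 gpe_omp_sketch.c -o gpe_omp_sketch -lm. The same per-grid-point loop structure is what the OpenMP/MPI and CUDA/MPI packages then split across cluster nodes.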
