General-purpose molecular dynamics simulations on GPU-based clusters
Institut für Physik, University of Technology Ilmenau, 98684 Ilmenau, Germany
arXiv:1009.4330v2 [cond-mat.mtrl-sci] (6 Mar 2011)
@article{2010arXiv1009.4330T,
title={{General-purpose molecular dynamics simulations on GPU-based clusters}},
author={{Trott}, C.~R. and {Winterfeld}, L. and {Crozier}, P.~S.},
journal={arXiv preprint arXiv:1009.4330},
archivePrefix={arXiv},
eprint={1009.4330},
primaryClass={cond-mat.mtrl-sci},
keywords={Condensed Matter - Materials Science, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Performance, Physics - Computational Physics},
year={2010},
month={sep},
adsurl={http://adsabs.harvard.edu/abs/2010arXiv1009.4330T},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
We present a GPU implementation of LAMMPS, a widely used parallel molecular dynamics (MD) software package, and show 5x to 13x single-node speedups versus the CPU-only version of LAMMPS. This new CUDA package for LAMMPS also enables multi-GPU simulation on hybrid heterogeneous clusters, using MPI for inter-node communication, CUDA kernels on the GPU for all methods working with particle data, and standard LAMMPS C++ code for CPU execution. Cell and neighbor list approaches are compared for best performance on GPUs, with thread-per-atom and block-per-atom neighbor list variants showing the best performance at low and high neighbor counts, respectively. Computational performance results of GPU-enabled LAMMPS are presented for a variety of materials classes (e.g. biomolecules, polymers, metals, semiconductors), along with a speed comparison versus other available GPU-enabled MD software. Finally, we show strong and weak scaling performance on a CPU/GPU cluster using up to 128 dual-GPU nodes.
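To illustrate the thread-per-atom strategy mentioned in the abstract, here is a minimal CUDA sketch of a neighbor-list force kernel in which each thread processes one atom's full neighbor list. The kernel name, data layout, and Lennard-Jones coefficients (lj1, lj2) are illustrative assumptions for this sketch, not the actual LAMMPS CUDA package code.

```cuda
// Minimal thread-per-atom Lennard-Jones force kernel over a dense neighbor list.
// Single atom type, no periodic images; layout and names are assumptions.
#include <cuda_runtime.h>

__global__ void lj_force_thread_per_atom(const float4* __restrict__ pos,    // x,y,z per atom
                                         float4* __restrict__ force,        // output fx,fy,fz per atom
                                         const int* __restrict__ neighbors, // flattened neighbor lists
                                         const int* __restrict__ numneigh,  // neighbor count per atom
                                         int maxneigh, int nlocal,
                                         float cutsq, float lj1, float lj2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nlocal) return;

    float4 xi = pos[i];
    float fx = 0.f, fy = 0.f, fz = 0.f;
    int n = numneigh[i];

    // One thread walks the entire neighbor list of atom i.
    for (int k = 0; k < n; ++k) {
        int j = neighbors[i * maxneigh + k];
        float dx = xi.x - pos[j].x;
        float dy = xi.y - pos[j].y;
        float dz = xi.z - pos[j].z;
        float rsq = dx * dx + dy * dy + dz * dz;
        if (rsq < cutsq) {
            float r2inv = 1.f / rsq;
            float r6inv = r2inv * r2inv * r2inv;
            float fpair = r6inv * (lj1 * r6inv - lj2) * r2inv;
            fx += dx * fpair;
            fy += dy * fpair;
            fz += dz * fpair;
        }
    }
    force[i] = make_float4(fx, fy, fz, 0.f);
}
```

In a block-per-atom variant, by contrast, the threads of one block would share the neighbor loop of a single atom and combine their partial forces with a block-level reduction, which pays off when neighbor counts are large enough to keep a full block busy.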
March 8, 2011 by hgpu