high performance computing on graphics processing units: hgpu.org

An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor

Vladimir Mironov, Yuri Alexeev, Kristopher Keipert, Michael D’mello, Alexander Moskovsky, Mark S. Gordon

Lomonosov Moscow State University, Leninskie Gory 1/3, Moscow 119991, Russian Federation

arXiv:1708.00033 [cs.DC], (31 Jul 2017)

DOI:10.1145/3126908.3126956

@article{mironov2017efficient,
  title={An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor},
  author={Mironov, Vladimir and Alexeev, Yuri and Keipert, Kristopher and D’mello, Michael and Moskovsky, Alexander and Gordon, Mark S.},
  year={2017},
  month={jul},
  archivePrefix={arXiv},
  primaryClass={cs.DC},
  doi={10.1145/3126908.3126956}
}


Modern OpenMP threading techniques are used to convert the MPI-only Hartree-Fock code in the GAMESS program to a hybrid MPI/OpenMP algorithm. Two implementations are considered, which differ in whether the key data structures, the density and Fock matrices, are shared among threads or replicated for each thread. All implementations are benchmarked on a supercomputer of 3,000 Intel Xeon Phi processors. With 64 cores per processor, scaling results are reported on up to 192,000 cores. The hybrid MPI/OpenMP implementation reduces the memory footprint by a factor of approximately 200 compared to the legacy code. The MPI/OpenMP code runs up to six times faster than the original MPI-only code for a range of molecular system sizes.
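The trade-off between sharing and replicating a matrix among threads can be illustrated with a minimal sketch. It is not taken from GAMESS or from the paper: Python threads stand in for OpenMP threads, the "Fock" matrix is a plain nested list, and `toy_contributions`, `build_shared`, and `build_replicated` are hypothetical names introduced only for this illustration. The abstract does not say how the real code synchronizes updates, so the lock used here is an assumption.

```python
import threading


def toy_contributions(n, n_tasks):
    """Hypothetical stand-in for integral work: a list of (row, col, value) updates."""
    return [(t % n, (t * 7) % n, float(t + 1)) for t in range(n_tasks)]


def _run(worker, n_threads):
    """Start one thread per id and wait for all of them to finish."""
    threads = [threading.Thread(target=worker, args=(tid,)) for tid in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()


def build_shared(n, tasks, n_threads):
    """Shared variant: all threads update a single matrix under a lock (assumed)."""
    fock = [[0.0] * n for _ in range(n)]
    lock = threading.Lock()

    def worker(tid):
        # Each thread takes every n_threads-th task.
        for i, j, v in tasks[tid::n_threads]:
            with lock:
                fock[i][j] += v

    _run(worker, n_threads)
    return fock, 1  # one matrix copy is held in memory


def build_replicated(n, tasks, n_threads):
    """Replicated variant: each thread fills a private copy, summed at the end."""
    copies = [[[0.0] * n for _ in range(n)] for _ in range(n_threads)]

    def worker(tid):
        mine = copies[tid]  # private to this thread, so no lock is needed
        for i, j, v in tasks[tid::n_threads]:
            mine[i][j] += v

    _run(worker, n_threads)
    # Final reduction over the per-thread copies.
    fock = [[sum(c[i][j] for c in copies) for j in range(n)] for i in range(n)]
    return fock, n_threads  # one matrix copy per thread


if __name__ == "__main__":
    tasks = toy_contributions(4, 100)
    shared, shared_copies = build_shared(4, tasks, 8)
    replicated, replicated_copies = build_replicated(4, tasks, 8)
    print(shared == replicated, shared_copies, replicated_copies)
```

Both variants produce the same matrix. The replicated one avoids coordinating writes but holds one copy per thread, while the shared one holds a single copy and has to coordinate concurrent updates.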

Tags: Algorithms, Benchmarking, Chemistry, Computer science, Intel Xeon Phi, MPI, OpenMP, Physics

August 8, 2017 by hgpu

