high performance computing on graphics processing units: hgpu.org

Adapting Irregular Computations to Large CPU-GPU Clusters in the MADNESS Framework

Vlad Slavici, Raghu Varier, Gene Cooperman, Robert J. Harrison

Northeastern University, Boston, MA

IEEE International Conference on Cluster Computing (CLUSTER), 2012

DOI:10.1109/CLUSTER.2012.42

Graphics Processing Units (GPUs) are becoming the workhorse of scalable computations. MADNESS is a scientific framework used especially for computational chemistry. Most MADNESS applications use operators that involve many small tensor computations, resulting in a less regular organization of computations on GPUs. A single GPU kernel may have to multiply by hundreds of small square matrices (with fixed dimension ranging from 10 to 28). We demonstrate a scalable CPU-GPU implementation of the MADNESS framework over a 500-node partition on the Titan supercomputer. For this hybrid CPU-GPU implementation, we observe up to a 2.3-times speedup compared to an equivalent CPU-only implementation with 16 cores per node. For smaller matrices, we demonstrate a 2.2-times speedup by using a custom CUDA kernel rather than a cuBLAS-based kernel.
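
To make the shape of this workload concrete, the following is a minimal CPU-side sketch in plain Python of one input being multiplied by a batch of many small square matrices in a single call. It is an illustration only, not the MADNESS code or the paper's CUDA kernel. The function names `matmul_small`, `batched_matmul`, and `identity` are hypothetical, and the values `k = 10` and `batch_size = 200` are assumptions picked from the ranges the abstract mentions.

```python
import random


def matmul_small(a, b):
    """Multiply two k x k matrices stored as lists of rows."""
    # Transpose b once so each inner product pairs a row of a with a column of b.
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def batched_matmul(x, matrices):
    """Multiply one k x k input by every matrix in the batch in a single call.

    On a GPU the analogous idea is to hand the whole batch to one kernel
    launch rather than launching once per small matrix (assumed analogy,
    not the paper's implementation).
    """
    return [matmul_small(x, m) for m in matrices]


def identity(k):
    """Return the k x k identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


k = 10            # illustrative; the abstract gives dimensions from 10 to 28
batch_size = 200  # illustrative stand-in for "hundreds" of matrices

rng = random.Random(0)
x = [[rng.random() for _ in range(k)] for _ in range(k)]

# Identity matrices keep the example easy to check: x * I must equal x.
matrices = [identity(k) for _ in range(batch_size)]
results = batched_matmul(x, matrices)
```

Each individual product here is tiny (on the order of k^3 multiply-adds), so the per-call overhead matters more than it would for large matrices. That is one plausible reading of why the abstract contrasts a custom kernel with a cuBLAS-based one for the smaller sizes.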

Tags: Chemistry, Computational chemistry, Computer science, CUBLAS, CUDA, GPU cluster, nVidia, Tesla M2090

November 6, 2012 by hgpu
