high performance computing on graphics processing units: hgpu.org


Efficient Implementation of MrBayes on multi-GPU

Jie Bao, Hongju Xia, Jianfu Zhou, Xiaoguang Liu, Gang Wang

College of Information Technical Science, Nankai University, Tianjin, China

College of Information Technical Science, Nankai University, 2013

@article{xia2013efficient,
  title={PDF Proof: Mol. Biol. Evol.},
  author={Xia, H. and Zhou, J. and Wang, G.},
  year={2013}
}


Package: GPU MrBayes


MrBayes, which uses Metropolis-coupled Markov chain Monte Carlo (MCMCMC, or (MC)^3 for short), is a popular program for Bayesian inference. Although Bayesian inference is a leading method for inferring phylogeny from DNA data, the (MC)^3 algorithm and its improved and parallel versions are still not fast enough for biologists to analyze massive real-world DNA data sets. Recently, the graphics processing unit (GPU) has shown its power as a co-processor (or rather, an accelerator) in many fields. This paper describes a(MC)^3 (aMCMCMC), an efficient implementation of MrBayes (MC)^3 on the Compute Unified Device Architecture (CUDA). By dynamically adjusting the task granularity to fit the input data size and hardware configuration, it makes full use of the GPU cores across different data sets. An adaptive method is also developed to split and combine DNA sequences so that a large number of GPU cards can be fully used. Furthermore, a new "node-by-node" task scheduling strategy is developed to improve concurrency, and several optimizations reduce extra overhead. Experimental results show that a(MC)^3 achieves up to 55x speedup over serial MrBayes on a single machine with one GPU card, up to 154x with four GPU cards, and up to 439x on a 32-node GPU cluster. a(MC)^3 is dramatically faster than all previous (MC)^3 algorithms and scales well to large GPU clusters.
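For readers unfamiliar with (MC)^3, the following is a minimal serial sketch of the general technique, not the paper's GPU implementation. It runs one cold chain and several heated chains on a toy one-dimensional bimodal target and periodically attempts to swap states between two chains. Everything named here is an assumption made for illustration: the functions `log_target`, `heating_schedule`, and `mc3`, the toy density standing in for the phylogenetic posterior, the incremental heating form beta_i = 1 / (1 + heat * i), and all parameter values.

```python
import math
import random


def log_target(x):
    # Toy bimodal log-density: equal mixture of unit-variance Gaussians
    # centred at -3 and +3 (unnormalised). Hypothetical stand-in for the
    # phylogenetic log-posterior, which is far more expensive to evaluate.
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def heating_schedule(n_chains, heat):
    # Assumed incremental heating: chain 0 is cold (beta = 1),
    # later chains are progressively hotter (smaller beta).
    return [1.0 / (1.0 + heat * i) for i in range(n_chains)]


def accept(rng, log_ratio):
    # Metropolis acceptance test with probability min(1, exp(log_ratio)).
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_chains=4, n_gens=5000, heat=0.5, step=1.0, seed=0):
    rng = random.Random(seed)
    betas = heating_schedule(n_chains, heat)
    states = [0.0] * n_chains
    cold_samples = []
    swaps_accepted = 0
    for _ in range(n_gens):
        # Within-chain Metropolis update; chain i targets pi(x) ** betas[i].
        for i in range(n_chains):
            proposal = states[i] + rng.gauss(0.0, step)
            log_r = betas[i] * (log_target(proposal) - log_target(states[i]))
            if accept(rng, log_r):
                states[i] = proposal
        # Between-chain swap attempt for one randomly chosen pair of chains.
        i, j = rng.sample(range(n_chains), 2)
        log_r = (betas[i] - betas[j]) * (
            log_target(states[j]) - log_target(states[i])
        )
        if accept(rng, log_r):
            states[i], states[j] = states[j], states[i]
            swaps_accepted += 1
        # Only the cold chain is sampled for inference.
        cold_samples.append(states[0])
    return cold_samples, swaps_accepted


if __name__ == "__main__":
    samples, swaps = mc3()
    print("cold-chain samples:", len(samples), "accepted swaps:", swaps)
```

In this sketch the heated chains cross the low-probability region between the two modes more easily and pass those states to the cold chain through swaps. The within-chain updates are independent of one another between swap attempts, which is the kind of chain-level parallelism that parallel (MC)^3 implementations can exploit in addition to parallelism inside each likelihood evaluation.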

Tags: Bayesian, Biology, CUDA, GPU cluster, nVidia, nVidia GeForce GTX 480, Package, Task scheduling

January 28, 2013 by hgpu
