19251

Accelerating Molecular Docking by Parallelized Heterogeneous Computing – A Case Study of Performance, Quality of Results, and Energy-Efficiency using CPUs, GPUs, and FPGAs

Leonardo Solis Vasquez
Computer Science Department, Technische Universität Darmstadt
Technische Universität Darmstadt, 2019

@phdthesis{tuprints9288,

   school={Technische Universit{"a}t},

   title={Accelerating Molecular Docking by Parallelized Heterogeneous Computing – A Case Study of Performance, Quality of Results, and Energy-Efficiency using CPUs, GPUs, and FPGAs},

   year={2019},

   month={November},

   author={Leonardo Solis Vasquez},

   address={Darmstadt},

   url={http://tuprints.ulb.tu-darmstadt.de/9288/}

}

Molecular Docking (MD) is a key tool in computer-aided drug design that aims to predict the binding pose between a small molecule and a macromolecular target. At its core, MD calculates the strength of possible binding poses, and searches for the energetically-stronger ones among those generated during simulation. Automatic Docking (AutoDock) is a widely-used MD code that employs a physics-based scoring function to quantify the binding strength. AutoDock also uses a Lamarckian Genetic Algorithm (LGA), and in turn, the Solis-Wets method, as a local-search algorithm, in order to find strong interactions of such molecular systems. Due to the highly-parallel nature of the LGA tasks involved, AutoDock can benefit from runtime acceleration based on parallelization. This thesis presents an OpenCL-based parallelization of AutoDock, and a corresponding evaluation in terms of execution performance, quality-of-results, and compute-energy efficiency, achieved on different platforms based on: multi-core Central Processing Unit (CPU)s, Graphics Processing Unit (GPU)s, and Field Programmable Gate Array (FPGA)s. While a data-parallel approach has proven its effectiveness in accelerating AutoDock on CPUs and GPUs, it was observed that for FPGAs, such approach resulted in slower executions in the range of three-orders of magnitude when compared against the original single-threaded AutoDock. To overcome this drawback, a task-parallel implementation for FPGAs is discussed as well. Besides presenting an AutoDock implementation being parallelized using OpenCL, this thesis also extends the LGA search with new alternative local-search methods based on gradients (of the scoring function) such as: Steepest-Descent, FIRE, and ADADELTA. Among these, it was found that ADADELTA provides significant algorithmic benefits over Solis-Wets, yielding a reduction in calculation effort down to 1/1300 of the legacy Solis-Wets method, while achieving equivalent quality-of-results. Compared to the original single-threaded AutoDock, the proposed data-parallel design achieves a speedup of up to ~399x and improves the compute-energy efficiency by up to ~297x when running on modern V100 GPUs. Furthermore, this thesis describes the adaptations performed on the proposed OpenCL-based implementation for supporting challenging real-world MD scenarios.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: