GPU-accelerated Chemical Similarity Assessment for Large Scale Databases

Marco Maggioni, Marco Domenico Santambrogio, Jie Liang

Department of Computer Science, University of Illinois at Chicago

Procedia Computer Science, Volume 4, Pages 2007-2016, Proceedings of the International Conference on Computational Science (ICCS 2011), 2011

DOI:10.1016/j.procs.2011.04.219

@article{maggionia2011gpu,
  title={GPU-accelerated Chemical Similarity Assessment for Large Scale Databases},
  author={Maggioni, M. and Santambrogio, M.D. and Liang, J.},
  journal={Procedia Computer Science},
  volume={4},
  pages={2007--2016},
  year={2011},
  publisher={Elsevier BV}
}

The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational area concerned with the manipulation of chemical structural information. Comparing molecules is the basis for a wide range of applications, such as searching chemical databases, training prediction models for virtual screening, and aggregating clusters of similar compounds. However, currently available databases with millions of compounds are a challenge for conventional chemoinformatics algorithms, creating a need for faster similarity methods. In this paper, we extensively analyze the advantages of using many-core architectures for calculating commonly used chemical similarity coefficients such as Tanimoto, Dice, and Cosine. Our aim is to provide a broad proof of concept of the usefulness of GPU architectures for chemoinformatics, a class of computing problems still largely unexplored on GPUs. We present a general GPU algorithm for all-to-all chemical comparisons that considers both binary fingerprints and floating-point descriptors as molecule representations. We then apply optimization techniques to minimize global memory accesses and further improve efficiency. We test the proposed algorithm on two experimental setups: a laptop with a low-end GPU and a desktop with a higher-performance GPU. In the former case, we obtain a 4-to-6-fold speed-up over a single-core implementation for fingerprints and a 4-to-7-fold speed-up for descriptors. In the latter case, we obtain a 195-to-206-fold speed-up for fingerprints and a 100-to-328-fold speed-up for descriptors.
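To make the coefficients named in the abstract concrete, here is a minimal single-core sketch in Python of Tanimoto, Dice, and Cosine similarity on binary fingerprints, followed by a naive all-to-all comparison. It is not the authors' GPU algorithm. The function names, the use of Python integers as bit vectors, and the convention of returning 0.0 for empty fingerprints are all assumptions made for illustration.

```python
import math


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer fingerprint."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """|A and B| / |A or B| for two binary fingerprints."""
    common = popcount(a & b)
    union = popcount(a) + popcount(b) - common
    # Assumption: two empty fingerprints score 0.0 rather than raising.
    return common / union if union else 0.0


def dice(a: int, b: int) -> float:
    """2 * |A and B| / (|A| + |B|)."""
    total = popcount(a) + popcount(b)
    return 2 * popcount(a & b) / total if total else 0.0


def cosine(a: int, b: int) -> float:
    """|A and B| / sqrt(|A| * |B|)."""
    denom = math.sqrt(popcount(a) * popcount(b))
    return popcount(a & b) / denom if denom else 0.0


def all_to_all(fingerprints, coefficient):
    """Naive O(n^2) similarity matrix; every (i, j) entry is independent."""
    n = len(fingerprints)
    return [
        [coefficient(fingerprints[i], fingerprints[j]) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical 4-bit fingerprints, chosen only for illustration.
    fps = [0b1011, 0b0011, 0b1100]
    for row in all_to_all(fps, tanimoto):
        print(["%.3f" % value for value in row])
```

Because each entry of the similarity matrix depends only on its own pair of molecules, the computation is a natural candidate for data-parallel hardware. How the paper maps the pairs onto GPU threads and how it reduces global memory traffic is not described in the abstract, so this sketch does not attempt to reproduce it.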

Tags: Algorithms, Chemistry, CUDA, Databases, nVidia, nVidia GeForce 9400 M, nVidia GeForce GTX 460, Optimization

December 27, 2011 by hgpu
