Compute Pairwise Manhattan Distance and Pearson Correlation Coefficient of Data Points with GPU
Computer Engineering & Computer Science Department, University of Louisville, Louisville, Kentucky 40292, United States of America
In SNPD ’09: Proceedings of the 2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing (2009), pp. 501-506.
@conference{chang2009compute,
title={Compute pairwise Manhattan distance and Pearson correlation coefficient of data points with GPU},
author={Chang, D.J. and Desoky, A.H. and Ouyang, M. and Rouchka, E.C.},
booktitle={2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing},
pages={501--506},
year={2009},
organization={IEEE}
}
Graphics processing units (GPUs) are powerful computational devices tailored to the needs of the 3-D gaming industry for high-performance, real-time graphics engines. In 2006, Nvidia Corporation released a new generation of GPUs designed for general-purpose computing, followed in 2007 by CUDA, a GPU programming language. DNA microarray technology is a high-throughput tool for assaying mRNA abundance in cell samples. In data analysis, scientists often apply hierarchical clustering to the genes, where a fundamental operation is calculating all pairwise distances. For n genes, this takes O(n^2) time. In this work, GPUs and the CUDA language are used to calculate pairwise distances. For Manhattan distance, GPU/CUDA achieves a 40- to 90-fold speed-up over the central processing unit (CPU) implementation; for the Pearson correlation coefficient, the speed-up is 28- to 38-fold.
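To make the O(n^2) pairwise operation concrete, here is a minimal CPU reference sketch in plain Python. It is not the authors' CUDA kernel; the function names (`manhattan`, `pearson`, `pairwise`) and the toy expression profiles are assumptions made for illustration. Each gene is treated as a list of expression values, and every pair of genes is compared once.

```python
from math import sqrt, isnan

def manhattan(x, y):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.

    Returns NaN when either profile is constant (zero variance),
    because the coefficient is undefined in that case.
    """
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)

def pairwise(rows, metric):
    """Fill an n-by-n matrix with metric(rows[i], rows[j]).

    Both metrics are symmetric, so each pair is computed once and
    mirrored. The work is still quadratic in the number of rows.
    """
    n = len(rows)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = metric(rows[i], rows[j])
            out[i][j] = value
            out[j][i] = value
    return out

# Hypothetical toy data: three genes measured in three samples.
genes = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
dist = pairwise(genes, manhattan)
corr = pairwise(genes, pearson)
```

Every entry of the output matrix depends only on its own pair of rows, so the entries can be computed independently of one another. That independence is what makes this kind of computation a natural candidate for a GPU, where, for example, one thread or thread block could be assigned to each pair.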
October 28, 2010 by hgpu