Compute Pairwise Manhattan Distance and Pearson Correlation Coefficient of Data Points with GPU
Computer Engineering & Computer Science Department, University of Louisville, Louisville, Kentucky 40292, United States of America
In SNPD ’09: Proceedings of the 2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing (2009), pp. 501-506.
Graphics processing units (GPUs) are powerful computational devices tailored towards the needs of the 3-D gaming industry for high-performance, real-time graphics engines. Nvidia Corporation released a new generation of GPUs designed for general-purpose computing in 2006, and it released a GPU programming language called CUDA in 2007. The DNA microarray technology is a high throughput tool for assaying mRNA abundance in cell samples. In data analysis, scientists often apply hierarchical clustering of the genes, where a fundamental operation is to calculate all pairwise distances. If there are n genes, it takes O(n^2) time. In this work, GPUs and the CUDA language are used to calculate pairwise distances. For Manhattan distance, GPU/CUDA achieves a 40 to 90 times speed-up compared to the central processing unit implementation; for Pearson correlation coefficient, the speed-up is 28 to 38 times.
October 28, 2010 by hgpu