high performance computing on graphics processing units: hgpu.org

Parallel Implementation of Similarity Measures on GPU Architecture using CUDA

Kuldeep Yadav, Ankush Mittal, M.A Ansari, Vennktesh Vishwarup

Department of Computer Science and Engineering, College of Engineering Roorkee, Roorkee-247667, INDIA

Indian Journal of Computer Science and Engineering (IJCSE), Vol. 3 No. 1, 2012

Image processing and pattern recognition algorithms are slow to execute on a single-core processor. Graphics Processing Units (GPUs) have become popular because of their speed, programmability, low cost, and large number of built-in execution cores. Many researchers have begun using GPUs as processing units alongside a single-core computer system to speed up the execution of algorithms. In the field of content-based medical image retrieval (CBMIR), the Euclidean and Mahalanobis distances play an important role in retrieving images, because the distance formula determines how images are matched. In this research work, we parallelized the Euclidean distance algorithm on CUDA. The CPU implementation was run on a machine with an Intel Dual-Core E5500 @ 2.80 GHz and 2.0 GB of main memory, running Windows XP (SP2). The next step was to convert this code to a GPU format, i.e. to run the program on an NVIDIA GeForce 9500 GT GPU with 1023 MB of DDR2 video memory and a 64-bit bus width. The graphics driver used was from NVIDIA's 270.81 series. In this paper, both the CPU and GPU versions of the algorithm are implemented in MATLAB R2010. The CPU version is analyzed in plain MATLAB, while the GPU version is implemented with the help of the intermediate software Jacket-win-1.3.0. Using Jacket requires some changes to the source code so that the CPU and GPU work simultaneously, thereby reducing the overall computation time. Our work makes extensive use of the highly multithreaded architecture of the multi-core GPU. Efficient use of shared memory is required to optimize parallel reduction in Compute Unified Device Architecture (CUDA). GPUs are emerging as powerful parallel systems at a low cost of a few thousand rupees.
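To make the computation concrete, the following is a minimal CPU-side sketch in Python of the pattern the abstract describes: each element's squared difference is independent (one GPU thread per element), and the partial results are then summed with a halving-stride tree reduction of the kind typically carried out in shared memory. It is not the authors' MATLAB/Jacket code; the function names (`squared_differences`, `tree_reduce_sum`, `euclidean_distance`, `nearest_image`) and the sample feature vectors are assumptions made for illustration only.

```python
import math


def squared_differences(a, b):
    """Element-wise (a[i] - b[i])**2; each entry is independent of the others,
    so on a GPU one thread could compute each entry."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [(x - y) ** 2 for x, y in zip(a, b)]


def tree_reduce_sum(values):
    """Sum values using the halving-stride pattern of a parallel reduction.

    At each step, slot i accumulates slot i + stride. On a GPU all slots of
    one step would run concurrently, with a barrier between steps; here the
    inner loop simply runs them one after another.
    """
    if not values:
        return 0.0
    # Pad with zeros to a power of two so every stride halves cleanly.
    size = 1
    while size < len(values):
        size *= 2
    buf = list(values) + [0.0] * (size - len(values))
    stride = size // 2
    while stride > 0:
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(tree_reduce_sum(squared_differences(a, b)))


def nearest_image(query, database):
    """Index of the database feature vector closest to the query
    (hypothetical retrieval step, assuming a non-empty database)."""
    distances = [euclidean_distance(query, features) for features in database]
    return min(range(len(distances)), key=distances.__getitem__)


if __name__ == "__main__":
    # Made-up feature vectors, not data from the paper.
    database = [[3.0, 4.0], [1.0, 0.0], [10.0, 10.0]]
    print(nearest_image([0.0, 0.0], database))
```

The reduction needs a number of steps logarithmic in the vector length rather than one step per element, which is why the buffer being reduced is worth keeping in fast on-chip memory on a real GPU.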

Tags: Algorithms, Computer vision, CUDA, Image processing, nVidia, nVidia GeForce 9500 GT, Pattern recognition

March 2, 2012 by hgpu
