high performance computing on graphics processing units: hgpu.org

Parallelization Strategies of the Canny Edge Detector for Multi-core CPUs and Many-core GPUs

Taieb Lamine Ben Cheikh, Giovanni Beltrame, Gabriela Nicolescu, Farida Cheriet, Sofiene Tahar

Department of Computer Science, Ecole Polytechnique de Montreal, Montreal, Canada

IEEE Northeast Workshop on Circuits and Systems (NEWCAS’12), 2012

In this paper we study two parallelization strategies (loop-level parallelism and domain decomposition) and investigate their impact on performance and scalability on two different parallel architectures. As a test application, we use the Canny Edge Detector because of its wide range of parallelization opportunities and its frequent use in computer vision applications. Different parallel implementations of the Canny Edge Detector are run on two distinct hardware platforms, namely a multi-core CPU and a many-core GPU. Our experiments uncover design rules that, depending on a set of application and platform factors (parallel features, data size, and architecture), indicate which parallelization scheme is more suitable.
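To make the two strategies named in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the two-stage pipeline (a 3x3 box blur standing in for Gaussian smoothing, followed by a central-difference gradient magnitude) is a simplified stand-in for the first Canny stages, and all function names, the `HALO` constant, and the row-band tiling are assumptions made for illustration. Loop-level parallelism parallelizes the row loop of each stage and synchronizes between stages; domain decomposition gives each worker a band of rows plus a halo and runs the whole pipeline on it.

```python
from concurrent.futures import ThreadPoolExecutor


def blur_row(img, y):
    """3x3 box blur of row y, clamp-to-edge borders (stand-in for Gaussian smoothing)."""
    h, w = len(img), len(img[0])
    out = []
    for x in range(w):
        total = 0
        for dy in (-1, 0, 1):
            yy = min(max(y + dy, 0), h - 1)
            for dx in (-1, 0, 1):
                xx = min(max(x + dx, 0), w - 1)
                total += img[yy][xx]
        out.append(total / 9.0)
    return out


def grad_row(img, y):
    """Gradient magnitude |gx| + |gy| of row y using central differences."""
    h, w = len(img), len(img[0])
    up, down, row = img[max(y - 1, 0)], img[min(y + 1, h - 1)], img[y]
    return [
        abs(row[min(x + 1, w - 1)] - row[max(x - 1, 0)]) + abs(down[x] - up[x])
        for x in range(w)
    ]


def pipeline_serial(img):
    """Reference version: both stages run one after the other on the whole image."""
    blurred = [blur_row(img, y) for y in range(len(img))]
    return [grad_row(blurred, y) for y in range(len(blurred))]


def pipeline_loop_level(img, workers=4):
    """Loop-level parallelism: each stage's row loop is split across workers.

    Collecting the first map into a list acts as the barrier between stages.
    """
    rows = range(len(img))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blurred = list(pool.map(lambda y: blur_row(img, y), rows))
        return list(pool.map(lambda y: grad_row(blurred, y), rows))


HALO = 2  # one extra row per 3x3 stage on each side (hypothetical constant)


def _process_band(img, start, stop):
    """Run the full pipeline on rows [start, stop) plus a halo, then crop the halo."""
    lo, hi = max(start - HALO, 0), min(stop + HALO, len(img))
    band = pipeline_serial(img[lo:hi])
    offset = start - lo
    return band[offset:offset + (stop - start)]


def pipeline_domain_decomposition(img, workers=4):
    """Domain decomposition: each worker owns a band of rows and runs every stage on it."""
    h = len(img)
    step = max(1, -(-h // workers))  # ceiling division
    bounds = [(s, min(s + step, h)) for s in range(0, h, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bands = pool.map(lambda b: _process_band(img, *b), bounds)
        return [row for band in bands for row in band]
```

The sketch shows the structural trade-off only: the loop-level version needs a synchronization point after every stage, while the domain-decomposition version synchronizes once at the end but recomputes the halo rows in neighbouring bands. Because CPython threads share the interpreter lock, this code illustrates the decomposition rather than delivering a speedup; an OpenMP or CUDA version would follow the same outline.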

Tags: Computer science, Computer vision, CUDA, nVidia, nVidia GeForce GTX 480

July 2, 2012 by hgpu
