high performance computing on graphics processing units: hgpu.org


Accelerating Discrete Wavelet Transforms on Parallel Architectures

David Barina, Michal Kula, Michal Matysek, Pavel Zemcik

Centre of Excellence IT4Innovations, Faculty of Information Technology, Brno University of Technology, Bozetechova 1/2, Brno, Czech Republic

arXiv:1704.08657 [cs.PF], (27 Apr 2017)

@article{barina2017accelerating,
  title={Accelerating Discrete Wavelet Transforms on Parallel Architectures},
  author={Barina, David and Kula, Michal and Matysek, Michal and Zemcik, Pavel},
  year={2017},
  month={apr},
  archivePrefix={"arXiv"},
  primaryClass={cs.PF}
}


The 2-D discrete wavelet transform (DWT) lies at the heart of many image-processing algorithms. Several studies have compared the performance of this transform on various shared-memory parallel architectures, especially on graphics processing units (GPUs). All these studies, however, considered only separable calculation schemes. We show that corresponding separable parts can be merged into non-separable units, which halves the number of steps. In addition, we introduce an optional optimization approach that reduces the number of arithmetic operations. The discussed schemes were adapted to the OpenCL framework and pixel shaders, and then evaluated on GPUs from the two biggest vendors. We demonstrate the performance of the proposed non-separable methods by comparison with existing separable schemes. The non-separable schemes outperform their separable counterparts on numerous setups, especially when pixel shaders are used.
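The idea of merging separable parts into a non-separable unit can be illustrated with a small sketch. This is not the authors' implementation. It assumes a single CDF 5/3-style predict step with periodic borders, and the function names `predict_separable` and `predict_merged` are hypothetical. The first function applies a horizontal pass followed by a vertical pass. The second produces the same coefficients in one pass that reads only the original image.

```python
def predict_separable(img):
    """Two passes: horizontal predict, then vertical predict (periodic borders)."""
    h, w = len(img), len(img[0])  # both assumed even
    out = [row[:] for row in img]
    # Pass 1: horizontal predict on odd columns, using even-column neighbours.
    for r in range(h):
        for c in range(1, w, 2):
            out[r][c] -= 0.5 * (out[r][c - 1] + out[r][(c + 1) % w])
    # Pass 2: vertical predict on odd rows, using the results of pass 1.
    for r in range(1, h, 2):
        for c in range(w):
            out[r][c] -= 0.5 * (out[r - 1][c] + out[(r + 1) % h][c])
    return out


def predict_merged(img):
    """One pass: each output sample depends only on the original image."""
    h, w = len(img), len(img[0])  # both assumed even
    out = [row[:] for row in img]
    for r in range(h):
        up, down = (r - 1) % h, (r + 1) % h
        for c in range(w):
            left, right = (c - 1) % w, (c + 1) % w
            horiz = 0.5 * (img[r][left] + img[r][right])
            vert = 0.5 * (img[up][c] + img[down][c])
            diag = 0.25 * (img[up][left] + img[up][right]
                           + img[down][left] + img[down][right])
            if r % 2 == 0 and c % 2 == 1:
                out[r][c] = img[r][c] - horiz                # horizontal detail
            elif r % 2 == 1 and c % 2 == 0:
                out[r][c] = img[r][c] - vert                 # vertical detail
            elif r % 2 == 1 and c % 2 == 1:
                out[r][c] = img[r][c] - horiz - vert + diag  # diagonal detail
            # even row, even column: sample is left unchanged
    return out


# Small integer test image; all weights are exact in binary floating point.
image = [[(r * 7 + c * 3) % 11 for c in range(8)] for r in range(6)]
assert predict_separable(image) == predict_merged(image)
```

In this sketch the merged form needs one pass instead of two, at the cost of a wider neighbourhood and more arithmetic per sample. On a GPU each pass typically corresponds to a synchronization point or a separate rendering pass, which is the kind of trade-off the abstract refers to.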

Tags: Algorithms, ATI, ATI Radeon HD 6970, Discrete Wavelet Transform, Image processing, nVidia, nVidia GeForce GTX Titan X, OpenCL, OpenGL, Performance, Pixel shaders

April 30, 2017 by hgpu
