high performance computing on graphics processing units: hgpu.org

On the Use of Small 2D Convolutions on GPUs

Shams A.H. Al Umairy, Alexander S. van Amesfoort, Irwan D. Setija, Martijn C. van Beurden, Henk J. Sips

Delft University of Technology, Delft, The Netherlands

Computer Architecture, Lecture Notes in Computer Science, Volume 6161/2012, 52-64, 2012

DOI:10.1007/978-3-642-24322-6_6

@inproceedings{al2012use,
  title={On the Use of Small 2D Convolutions on GPUs},
  author={Al Umairy, S. and van Amesfoort, A. and Setija, I. and van Beurden, M. and Sips, H.},
  booktitle={Computer Architecture},
  pages={52--64},
  year={2012},
  organization={Springer}
}

Computing many small 2D convolutions using FFTs is a basis for a large number of applications in many domains in science and engineering, among them electromagnetic diffraction modeling in physics. The GPU architecture seems to be a suitable architecture to accelerate these convolutions, but reaching high application performance requires substantial development time and non-portable optimizations. In this work, we present the techniques, performance results and considerations to accelerate small 2D convolutions using CUDA, and compare performance to a multi-threaded CPU implementation. To improve programmability and performance of applications that make heavy use of small convolutions, we argue that two improvements to software and hardware are needed: FFT libraries must be extended with a single convolution function and communication bandwidth between CPU and GPU needs to be drastically improved.
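To make the pipeline named in the abstract concrete, the following is a minimal sketch of an FFT-based 2D convolution wrapped in a single function: forward transform of both inputs, pointwise product, inverse transform. It is not the authors' CUDA implementation. The function names (`fft1d`, `fft2d`, `convolve2d_fft`) are hypothetical, the code is plain Python for readability, and it assumes equally sized grids whose dimensions are powers of two, producing a circular (wrap-around) convolution.

```python
import cmath


def fft1d(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft1d(x[0::2], inverse)
    odd = fft1d(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(a, inverse=False):
    """2D FFT: transform every row, then every column (unnormalized)."""
    rows = [fft1d(r, inverse) for r in a]
    cols = [fft1d(list(c), inverse) for c in zip(*rows)]
    # Transpose back so the result is indexed [row][column] again.
    return [list(r) for r in zip(*cols)]


def convolve2d_fft(a, b):
    """Circular 2D convolution of two equally sized power-of-two grids."""
    n_rows, n_cols = len(a), len(a[0])
    fa, fb = fft2d(a), fft2d(b)
    # Convolution in the spatial domain is a pointwise product in the frequency domain.
    prod = [[fa[i][j] * fb[i][j] for j in range(n_cols)] for i in range(n_rows)]
    scale = n_rows * n_cols  # normalization for the unnormalized inverse transform
    return [[v / scale for v in row] for row in fft2d(prod, inverse=True)]
```

For small grids the three steps are individually cheap, so in a GPU setting the cost of launching each step separately and of moving data between CPU and GPU can dominate. That is the motivation the abstract gives for a single library-level convolution function and for higher CPU-GPU bandwidth.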

Tags: Computer science, CUDA, Electrodynamics, FFT, nVidia, nVidia GeForce 8800 GTX, nVidia GeForce GTX 280, Optimization, Physics, Tesla C1060

March 15, 2012 by hgpu
