high performance computing on graphics processing units: hgpu.org


Using hybrid GPU/CPU kernel splitting to accelerate spherical convolutions

P. M. Sutter, Benjamin D. Wandelt, Franz Elsner

INFN – National Institute for Nuclear Physics, via Valerio 2, I-34127 Trieste, Italy

arXiv:1409.4441 [astro-ph.CO], (15 Sep 2014)

@article{2014arXiv1409.4441S,
  author={{Sutter}, P.~M. and {Wandelt}, B.~D. and {Elsner}, F.},
  title={{Using hybrid GPU/CPU kernel splitting to accelerate spherical convolutions}},
  journal={ArXiv e-prints},
  archivePrefix={arXiv},
  eprint={1409.4441},
  keywords={Astrophysics – Cosmology and Nongalactic Astrophysics, Astrophysics – Instrumentation and Methods for Astrophysics},
  year={2014},
  month={sep},
  adsurl={http://adsabs.harvard.edu/abs/2014arXiv1409.4441S},
  adsnote={Provided by the SAO/NASA Astrophysics Data System}
}


We present a general method that accelerates, by more than an order of magnitude, the convolution of pixelated functions on the sphere with a radially symmetric kernel. Our method splits the kernel into a compact real-space component and a compact spherical-harmonic-space component. These components can then be convolved in parallel using an inexpensive commodity GPU and a CPU. We provide models for the computational cost of both real-space and Fourier-space convolutions and an estimate for the approximation error. Using these models, we can determine the optimum split that minimizes the wall-clock time for the convolution while satisfying the desired error bounds. We apply this technique to the problem of simulating a cosmic microwave background (CMB) anisotropy sky map at the resolution typical of the high-resolution maps produced by the Planck mission. For the main Planck CMB science channels we achieve a speedup of over a factor of ten, assuming an acceptable fractional rms error of order 10^-5 in the power spectrum of the output map.
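The split selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the two components run concurrently, so the wall-clock time is the larger of the GPU and CPU times, and it picks the cheapest candidate split whose estimated error stays within the bound. The function name `choose_split`, the constant `N_PIX`, the cost and error models `toy_t_gpu`, `toy_t_cpu`, `toy_err`, and every numeric constant are hypothetical placeholders, not the models from the paper.

```python
import math


def choose_split(candidates, t_gpu, t_cpu, err, eps):
    """Return (split, wall_time) with the smallest max(t_gpu, t_cpu)
    among candidate splits whose estimated error is at most eps."""
    best = None
    for split in candidates:
        if err(split) > eps:
            continue  # this split violates the error bound
        # The two parts are assumed to run concurrently, so the slower
        # of the two sets the wall-clock time.
        wall = max(t_gpu(split), t_cpu(split))
        if best is None or wall < best[1]:
            best = (split, wall)
    if best is None:
        raise ValueError("no candidate split meets the error bound")
    return best


# Toy models below are assumptions for illustration only.
# A split is a pair (theta_c, l_max): the real-space part is truncated at
# angular radius theta_c (radians), the harmonic part is band-limited at l_max.
N_PIX = 5.0e7  # assumed pixel count


def toy_t_gpu(split):
    theta_c, _ = split
    # Fraction of the sphere covered by a disc of angular radius theta_c.
    disc_fraction = (1.0 - math.cos(theta_c)) / 2.0
    # Each output pixel sums over the pixels inside the disc.
    return 1.0e-12 * N_PIX * (N_PIX * disc_fraction)


def toy_t_cpu(split):
    _, l_max = split
    # Assumed cubic scaling of a harmonic transform with the band limit.
    return 1.0e-9 * l_max ** 3


def toy_err(split):
    theta_c, l_max = split
    # Placeholder: the error shrinks as the two supports overlap more.
    return math.exp(-theta_c * l_max)


if __name__ == "__main__":
    grid = [(math.radians(deg), l_max)
            for deg in (0.5, 1.0, 2.0, 4.0)
            for l_max in (256, 512, 1024, 2048)]
    split, wall = choose_split(grid, toy_t_gpu, toy_t_cpu, toy_err, eps=1.0e-5)
    print("theta_c [deg]:", math.degrees(split[0]), "l_max:", split[1],
          "wall time [arbitrary units]:", wall)
```

An exhaustive grid search is used here only because it is easy to check by hand. With real cost and error models, the same selection rule applies to whatever parameterization of the split those models use.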

Tags: Astrophysics, Cosmology, CUDA, nVidia, nVidia GeForce GTX 480

September 17, 2014 by hgpu
