
Optimisation and GPU code generation of Stencils for Futhark

Christian Charlie Virt, Jonathan Wraa-Hansen
Department of Computer Science, University of Copenhagen
University of Copenhagen, 2021

@article{virt2021optimisation,
   title={Optimisation and GPU code generation of Stencils for Futhark},
   author={Virt, Christian Charlie and Wraa-Hansen, Jonathan},
   year={2021}
}


Stencils are a common computational pattern in scientific computing. Exploiting parallelism is central to reducing the execution time of stencils that run over large amounts of data, which makes stencils well suited to a GPGPU setting. However, programming stencils for massively parallel hardware is time-consuming and error-prone. It is therefore useful to express stencils in a more abstract form in a high-level programming language and let a compiler translate them into efficient parallel GPGPU code. Futhark is a high-level programming language whose purpose is to produce efficient multi-threaded CPU, CUDA and OpenCL programs, but it has no native support for stencils. This thesis concerns the implementation of code generation for a stencil construct in the OpenCL and CUDA back-ends of the Futhark compiler. We investigate and analyse many designs for running stencils in a GPGPU setting, then choose the most efficient and robust prototype to guide our implementation of code generation for the stencil construct in the Futhark compiler. The implemented stencil construct provides significant speedups over what could previously be achieved with a nested map implementation in Futhark. For some hardware and stencils we achieve speedups of up to three times.
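For readers unfamiliar with the term, the following is a minimal illustrative sketch of a 2D stencil written as a nested map, in plain Python rather than Futhark. It is not taken from the thesis: the function name `jacobi_step`, the 5-point averaging kernel and the clamped boundary handling are all assumptions chosen for illustration.

```python
def jacobi_step(grid):
    """Apply one 5-point averaging stencil step to a 2D list of floats.

    Hypothetical example: out-of-range neighbours are clamped to the
    nearest edge cell, which is only one of several boundary policies.
    """
    rows, cols = len(grid), len(grid[0])

    def at(i, j):
        # Clamp indices so border cells reuse their nearest in-range neighbour.
        i = min(max(i, 0), rows - 1)
        j = min(max(j, 0), cols - 1)
        return grid[i][j]

    # The two nested comprehensions play the role of a nested map:
    # each output cell reads only from the input grid, so all cells
    # are independent and could in principle be computed in parallel.
    return [
        [
            (at(i, j) + at(i - 1, j) + at(i + 1, j)
             + at(i, j - 1) + at(i, j + 1)) / 5.0
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Each output cell re-reads neighbours that adjacent cells also read. That overlap in memory accesses is the kind of redundancy a dedicated stencil construct can, in principle, exploit on a GPU.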
