Improving 3D Lattice Boltzmann Method stencil with asynchronous transfers on many-core processors

Minh Quan Ho, Christian Obrecht, Bernard Tourancheau, Benoit Dupont de Dinechin, Julien Hascoet
CNRS, LIG UMR 5217, Grenoble Alps University, F-38058 Grenoble, France
hal-01652614, (30 November 2017)


   title={Improving 3D Lattice Boltzmann Method stencil with asynchronous transfers on many-core processors},

   author={Ho, Minh-Quan and Obrecht, Christian and Tourancheau, Bernard and de Dinechin, Beno{\^i}t Dupont and Hascoet, Julien},

   booktitle={36th IEEE International Performance Computing and Communications Conference (IPCCC 2017)},






CPU-based many-core processors present an alternative to multicore CPU and GPU processors. In particular, the 93-Petaflops Sunway supercomputer, built from clustered many-core processors, has opened a new era for high performance computing that does not rely on GPU acceleration. However, memory bandwidth remains the main challenge for these architectures. This motivates our effort to optimize one of the most data-intensive kinds of stencil computation, namely three-dimensional applications of the lattice Boltzmann method (LBM). Taking a representative 3D LBM solver as an example, we propose optimizations for many-core processors that use local memory and asynchronous software prefetching. We achieve a 33% performance gain on the Kalray MPPA-256 many-core processor by actively streaming data from/to local memory, compared to the "passive" OpenCL programming model.
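
The idea of actively streaming data through local memory can be illustrated with a double-buffering loop: while one local buffer is being processed, the next tile is fetched into the other. The following is a minimal sketch in plain C, not the authors' implementation and not the Kalray API. The names `async_get`, `async_wait`, `relax`, `stream_tiles` and the tile size `TILE` are all hypothetical, and `memcpy` stands in for what would be an asynchronous transfer on real hardware.

```c
#include <stddef.h>
#include <string.h>

#define TILE 64 /* hypothetical tile size: elements per local buffer */

/* Stand-in for an asynchronous copy into local memory. A real many-core
   runtime would start a transfer and return immediately; memcpy is
   synchronous, so this only mimics the control flow. */
static void async_get(double *local, const double *global, size_t n) {
    memcpy(local, global, n * sizeof(double));
}

/* Stand-in for waiting on the outstanding transfer. It is a no-op here
   because the memcpy above has already completed. */
static void async_wait(void) {}

/* Hypothetical per-element kernel, a placeholder for the real update. */
static double relax(double f) { return 0.5 * f + 1.0; }

/* Process n elements tile by tile with two local buffers. */
void stream_tiles(const double *in, double *out, size_t n) {
    double buf[2][TILE];
    size_t ntiles = (n + TILE - 1) / TILE;
    if (ntiles == 0) return;

    /* Prefetch the first tile before entering the loop. */
    async_get(buf[0], in, n < TILE ? n : TILE);

    for (size_t t = 0; t < ntiles; ++t) {
        size_t cur = t & 1;          /* buffer holding tile t */
        size_t off = t * TILE;
        size_t len = (n - off < TILE) ? n - off : TILE;

        async_wait();                /* tile t is now in buf[cur] */

        /* Issue the fetch of tile t+1 into the other buffer, so that on
           real hardware it would overlap with the computation below. */
        if (t + 1 < ntiles) {
            size_t noff = off + TILE;
            size_t nlen = (n - noff < TILE) ? n - noff : TILE;
            async_get(buf[cur ^ 1], in + noff, nlen);
        }

        for (size_t i = 0; i < len; ++i)
            buf[cur][i] = relax(buf[cur][i]);

        /* Write the finished tile back to global memory. */
        memcpy(out + off, buf[cur], len * sizeof(double));
    }
}
```

With a truly asynchronous transfer engine, the fetch of tile t+1 would proceed while tile t is computed, which is where the bandwidth latency would be hidden. A 3D LBM stencil would additionally need halo cells around each tile, which this one-dimensional sketch omits.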