
From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels

Gregor Daiß, Patrick Diehl, Dominic Marcello, Alireza Kheirkhahan, Hartmut Kaiser, Dirk Pflüger
LSU Center for Computation & Technology, Louisiana State University, Baton Rouge, LA 70803, U.S.A.
arXiv:2210.06438 [cs.DC] (26 Sep 2022)

@misc{daiss2022aggregation,
   doi = {10.48550/ARXIV.2210.06438},
   url = {https://arxiv.org/abs/2210.06438},
   author = {Daiß, Gregor and Diehl, Patrick and Marcello, Dominic and Kheirkhahan, Alireza and Kaiser, Hartmut and Pflüger, Dirk},
   keywords = {Distributed, Parallel, and Cluster Computing (cs.DC), FOS: Computer and information sciences},
   title = {From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels},
   publisher = {arXiv},
   year = {2022},
   copyright = {arXiv.org perpetual, non-exclusive license}
}

Meeting both scalability and performance portability requirements is a challenge for any HPC application, especially for adaptively refined ones. In Octo-Tiger, an astrophysics application for the simulation of stellar mergers, we approach this with existing solutions: we employ HPX to obtain fine-grained tasks that let us easily distribute work and finely overlap communication and computation. For the computations themselves, we use Kokkos to turn these tasks into compute kernels capable of running on hardware ranging from a few CPU cores to powerful accelerators. There is a missing link, however: while the fine-grained parallelism exposed by HPX is useful for scalability, it can hinder GPU performance when the tasks become too small to saturate the device, causing low resource utilization. To bridge this gap, we investigate several GPU work-aggregation strategies within Octo-Tiger, adding one new strategy, and evaluate the node-level performance impact on recent AMD and NVIDIA GPUs, achieving noticeable speedups.
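The core idea behind work aggregation is easiest to see in a small example: instead of launching one tiny GPU kernel per fine-grained task, several tasks are buffered and executed through a single, larger kernel launch. The sketch below is not Octo-Tiger's actual aggregation machinery (the paper builds that on top of HPX executors); it is a minimal Kokkos illustration of the underlying motivation, with the task count (TASKS) and per-task work size (WORK) chosen arbitrarily for the example.

#include <Kokkos_Core.hpp>

// Hypothetical fine-grained "task": update a small sub-range of WORK elements.
// Launching one GPU kernel per such task under-utilizes the device.
constexpr int WORK  = 512;   // elements touched by one task (assumption)
constexpr int TASKS = 256;   // number of tasks arriving concurrently (assumption)

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    Kokkos::View<double*> data("data", TASKS * WORK);

    // Naive variant: one small kernel launch per task -> poor GPU occupancy.
    for (int t = 0; t < TASKS; ++t) {
      Kokkos::parallel_for("per_task_kernel",
        Kokkos::RangePolicy<>(t * WORK, (t + 1) * WORK),
        KOKKOS_LAMBDA(const int i) { data(i) = 2.0 * i; });
    }
    Kokkos::fence();

    // Aggregated variant: buffer the tasks and launch a single kernel over the
    // combined index range, giving the device enough work in one launch.
    Kokkos::parallel_for("aggregated_kernel",
      Kokkos::RangePolicy<>(0, TASKS * WORK),
      KOKKOS_LAMBDA(const int i) { data(i) = 2.0 * i; });
    Kokkos::fence();
  }
  Kokkos::finalize();
  return 0;
}

On a GPU backend, the first loop issues 256 launches of 512 work items each, none of which can saturate a modern accelerator, while the aggregated variant performs the same work in one launch over 131,072 indices. The paper's contribution is to obtain this effect dynamically and transparently for the fine-grained tasks produced by HPX, rather than by hand-written batching as in this sketch.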