Maximizing Parallelism and GPU Utilization For Direct GPU Compilation Through Ensemble Execution

Shilei Tian, Barbara Chapman, Johannes Doerfert

Stony Brook University, Stony Brook, NY, USA

Workshop on LLVM in Parallel Processing (LLPP), 2023

GPUs are renowned for their exceptional computational acceleration capabilities, achieved through massive parallelism. However, utilizing GPUs for computation requires manual identification of code regions suitable for offloading, data transfer management, and synchronization. Recent advancements have capitalized on the LLVM/OpenMP portable target offloading interface, elevating GPU acceleration to new heights. This approach, known as direct GPU compilation, involves compiling the entire host application for execution on the GPU, eliminating the need for explicit offloading directives. However, direct GPU compilation is limited to the thread parallelism a CPU application exposes, which is often not enough to saturate a modern GPU. This paper explores an alternative approach to enhance parallelism by enabling ensemble execution. We introduce a proof-of-concept implementation that maps each invocation of an application on a different input to an individual team executed by the same GPU kernel. To simplify usage, our enhanced GPU loader can read the command line arguments for the different instances from a file. Through extensive evaluation using four benchmarks, we observe up to 51X speedup for 64 instances. This demonstrates the effectiveness of ensemble execution in improving parallelism and optimizing GPU utilization for CPU programs compiled and executed directly on the GPU.
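
The idea of reading per-instance command lines from a file and giving each instance its own team can be illustrated with a small host-side sketch. This is not the authors' loader: the file format (one instance per line, shell-style quoting) and the names `parse_instance_file`, `run_ensemble`, and `demo_main` are assumptions made for illustration, and the sequential loop only stands in for the teams that would run concurrently inside a single GPU kernel.

```python
import shlex


def parse_instance_file(text):
    """Return one argument list per non-empty line.

    Assumed format (hypothetical): each line holds the command line
    arguments of one application instance, with shell-style quoting.
    """
    instances = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        instances.append(shlex.split(line))
    return instances


def run_ensemble(app_main, instances):
    """Call app_main once per instance.

    The loop index plays the role of a team id. In ensemble execution the
    instances would run concurrently, one per team; here they run one after
    another purely to show the mapping.
    """
    results = []
    for team_id, argv in enumerate(instances):
        results.append((team_id, app_main(argv)))
    return results


def demo_main(argv):
    # Hypothetical stand-in for the application's entry point:
    # it just reports how many arguments it received.
    return len(argv)


if __name__ == "__main__":
    text = "-n 128 input_a.dat\n-n 256 input_b.dat\n"
    for team_id, result in run_ensemble(demo_main, parse_instance_file(text)):
        print(team_id, result)
```

Each line of the argument file becomes one independent invocation, so the number of lines determines the number of instances (64 in the largest configuration reported above).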

Tags: Benchmarking, Computer science, CUDA, LLVM, nVidia, nVidia A100, OpenMP, Performance

July 24, 2023 by hgpu
