high performance computing on graphics processing units: hgpu.org

Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System

Alan Humphrey, Qingyu Meng, Martin Berzins, Todd Harman

Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT 84112 USA

SCI Technical Report, No. UUSCI-2012-003, SCI Institute, University of Utah, 2012

@TechReport{SCI:Hum2012a,
  author = {A. Humphrey and Q. Meng and M. Berzins and T. Harman},
  title = {Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System},
  number = {UUSCI-2012-003},
  year = {2012},
  institution = {SCI Institute},
  type = {SCI Technical Report},
  keywords = {csafe},
  url = {http://www.sci.utah.edu/publications/SCITechReports/UUSCI-2012-003.pdf}
}

The Uintah Computational Framework was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids on large-scale, long-running, data-intensive problems. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with a novel asynchronous task-based approach with fully automated load balancing. Uintah demonstrates excellent weak and strong scalability at full machine capacity on XSEDE resources such as Ranger and Kraken, and through the use of a hybrid memory approach based on a combination of MPI and Pthreads, Uintah now runs on up to 262k cores on the DOE Jaguar system. In order to extend Uintah to heterogeneous systems, with ever-increasing CPU core counts and additional on-node GPUs, a new dynamic CPU-GPU task scheduler is designed and evaluated in this study. This new scheduler enables Uintah to fully exploit these architectures with support for asynchronous, out-of-order scheduling of both CPU and GPU computational tasks. A new runtime system has also been implemented with an added multi-stage queuing architecture for efficient scheduling of CPU and GPU tasks. This new runtime system automatically handles the details of asynchronous memory copies to and from the GPU and introduces a novel method of pre-fetching and preparing GPU memory prior to GPU task execution. In this study this new design is examined in the context of a developing, hierarchical GPU-based ray tracing radiation transport model that provides Uintah with additional capabilities for heat transfer and electromagnetic wave propagation. The capabilities of this new scheduler design are tested by running at large scale on the modern heterogeneous systems, Keeneland and TitanDev, with up to 360 and 960 GPUs respectively. On these systems, we demonstrate significant speedups per GPU against a standard CPU core for our radiation problem.
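
The multi-stage queuing idea in the abstract can be illustrated with a toy model. The sketch below is not Uintah's scheduler and uses none of its API: the `Task` class, the `run_schedule` function, and the four stage names are hypothetical and chosen only for illustration. It assumes that a GPU task must pass through a host-to-device copy stage before it runs and a device-to-host copy stage after it runs, and that each task advances one stage per scheduler tick, so the input copy for one task can be in flight while another task executes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; not a Uintah type."""
    name: str
    device: str                               # "cpu" or "gpu"
    deps: set = field(default_factory=set)    # names of tasks that must finish first


def run_schedule(tasks):
    """Simulate a toy multi-stage queue and return the ordered event log."""
    waiting = list(tasks)   # stage 1: dependencies not yet satisfied
    h2d = deque()           # stage 2: GPU inputs being copied host -> device
    ready = deque()         # stage 3: tasks ready to execute
    d2h = deque()           # stage 4: GPU results being copied device -> host
    done, log = set(), []

    while waiting or h2d or ready or d2h:
        progressed = False

        # Stages are drained back to front, so a task moves one stage per tick.
        while d2h:
            t = d2h.popleft()
            log.append(f"d2h:{t.name}")
            done.add(t.name)
            progressed = True

        # Collect this tick's finished GPU tasks separately so they wait
        # for the next tick before their results are copied back.
        next_d2h = deque()
        while ready:
            t = ready.popleft()
            log.append(f"run:{t.name}")
            if t.device == "gpu":
                next_d2h.append(t)
            else:
                done.add(t.name)
            progressed = True
        d2h = next_d2h

        # Inputs whose copy finished make their task runnable on the next tick.
        next_ready = deque()
        while h2d:
            t = h2d.popleft()
            log.append(f"h2d:{t.name}")
            next_ready.append(t)
            progressed = True

        # Release tasks whose dependencies are complete. GPU tasks start
        # their input copy first; CPU tasks go straight to the ready queue.
        still_waiting = []
        for t in waiting:
            if t.deps <= done:
                (h2d if t.device == "gpu" else next_ready).append(t)
                progressed = True
            else:
                still_waiting.append(t)
        waiting = still_waiting
        ready = next_ready

        if not progressed:
            raise ValueError("dependency cycle or missing task")

    return log


if __name__ == "__main__":
    demo = [
        Task("A", "cpu"),
        Task("B", "gpu"),
        Task("C", "gpu", {"A"}),
        Task("D", "cpu", {"B", "C"}),
    ]
    for event in run_schedule(demo):
        print(event)
```

In this toy model the input copy for task `C` is logged in the same tick in which task `B` runs, which is the overlap that pre-fetching is meant to achieve. A real runtime would issue the copies on asynchronous device streams and poll for their completion rather than count ticks.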

Tags: CUDA, Fluid dynamics, GPU cluster, Heterogeneous systems, MPI, nVidia, Partial differential equations, Pthreads, Task scheduling, Tesla M2090

May 11, 2012 by hgpu
