high performance computing on graphics processing units: hgpu.org

An Environment to Support GPU and Multicore Programming for Rapid, High Performance, Application Deployment

James Laurence Brock

The Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts

Northeastern University, 2012

@phdthesis{brock2012environment,
  title={An Environment to Support GPU and Multicore Programming for Rapid, High Performance, Application Deployment},
  author={Brock, J.L.},
  school={NORTHEASTERN UNIVERSITY},
  year={2012}
}

Homogeneous multicore processors, heterogeneous multicore processors, high performance accelerators, and other heterogeneous architectures offer significantly more computing potential than traditional single-core processors. Computer systems composed of these specialized processing elements are increasingly common. Because these architectures are more complex, programming for them has become more difficult and error-prone. Each architecture has its own memory system, programming languages, and development environment. This has driven the need for portable programming APIs and tools that let developers exploit the full computational power of these platforms and move their programs between different computing systems with little effort. To address these challenges, MIT Lincoln Laboratory developed the Parallel Vector Tile Optimizing Library (PVTOL) to simplify portable programming for complex systems. The PVTOL Tasks and Conduits framework provides a set of high-level programming constructs for writing high performance code that is portable across a range of traditional and heterogeneous architectures. This research extends PVTOL to support Graphics Processing Units (GPUs) and heterogeneous computing architectures using both the NVIDIA Compute Unified Device Architecture (CUDA) and the Open Computing Language (OpenCL), while maintaining simplicity of programming and portability. We have demonstrated the utility of this framework by porting both a quantum Monte Carlo simulation and a 3D cone beam image reconstruction application to different systems consisting of various heterogeneous architectures. These applications have been ported from single CPU/GPU systems up to heterogeneous cluster architectures with as many as 24 nodes containing GPUs, showing significant speedup and scalability with minimal developer effort.
Using this framework, we have achieved total application run-time speedups of 115x for quantum Monte Carlo simulations on 24 distributed GPU nodes and 315x for 3D cone beam image reconstruction on 16 distributed GPU nodes, both compared to multithreaded C code.

Tags: ATI, ATI Radeon HD 5870, Computer science, CUDA, Heterogeneous systems, Image reconstruction, Monte Carlo simulation, nVidia, nVidia GeForce 9800 GX2, nVidia GeForce GTX 560 Ti, OpenCL, Tesla C1060, Tesla S1070, Thesis

October 26, 2012 by hgpu
