high performance computing on graphics processing units: hgpu.org

On the Portability of GPU-Accelerated Applications via Automated Source-to-Source Translation

Paul Sathre, Mark Gardner, Wu-chun Feng

Virginia Tech, Dept. of Computer Science, Blacksburg, Virginia, USA

International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia), 2019

@inproceedings{sathre2019portability,
  title={On the Portability of GPU-Accelerated Applications via Automated Source-to-Source Translation},
  author={Sathre, Paul and Gardner, Mark and Feng, Wu-chun},
  booktitle={Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region},
  pages={1–8},
  year={2019},
  organization={ACM}
}

Package:

CU2CL: A prototype CUDA-to-OpenCL source-to-source translator

Over the past decade, accelerator-based supercomputers have grown from 0% to 42% performance share on the TOP500. Ideally, GPU-accelerated code on such systems should be "write once, run anywhere," regardless of the GPU device (or for that matter, any parallel device, e.g., CPU or FPGA). In practice, however, portability can be significantly more limited due to the sheer volume of code implemented in non-portable languages. For example, the tremendous success of CUDA, as evidenced by the vast cornucopia of CUDA-accelerated applications, makes it infeasible to manually rewrite all these applications to achieve portability. Consequently, we achieve portability by using our automated CUDA-to-OpenCL source-to-source translator called CU2CL. To demonstrate the state of the practice, we use CU2CL to automatically translate three medium-to-large, CUDA-optimized codes to OpenCL, thus enabling the codes to run on other GPU-accelerated systems (as well as CPU- or FPGA-based systems). These automatically translated codes deliver performance portability, including as much as three-fold performance improvement, on a GPU device not supported by CUDA.
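To make the idea of CUDA-to-OpenCL source-to-source translation more concrete, the following is a toy sketch that rewrites a handful of CUDA device-side constructs into their OpenCL counterparts using regular expressions. It is not CU2CL and does not reflect how CU2CL is implemented, which the abstract does not describe. The function name `toy_cuda_to_opencl`, the sample `saxpy` kernel, and the choice of constructs are all assumptions made for illustration. A real translator would need to operate on a parsed program rather than raw text, and would also have to rewrite host-side runtime calls, which this sketch ignores.

```python
import re

# Illustrative subset of CUDA -> OpenCL device-side equivalences.
# Hypothetical helper names; this is NOT the CU2CL implementation.
BUILTIN_MAP = {
    "threadIdx": "get_local_id",
    "blockIdx": "get_group_id",
    "blockDim": "get_local_size",
    "gridDim": "get_num_groups",
}
AXIS = {"x": 0, "y": 1, "z": 2}


def _rewrite_signature(match):
    """Turn `__global__ void f(...)` into `__kernel void f(...)`.

    Pointer parameters are assumed to point to global memory, so they
    receive the OpenCL `__global` address-space qualifier.
    """
    name, params = match.group(1), match.group(2)
    rewritten = []
    for param in params.split(","):
        param = param.strip()
        if "*" in param:
            param = "__global " + param
        rewritten.append(param)
    return "__kernel void {}({})".format(name, ", ".join(rewritten))


def toy_cuda_to_opencl(src):
    """Apply a few textual CUDA -> OpenCL rewrites to kernel source."""
    # Kernel qualifier and pointer-parameter address spaces.
    src = re.sub(r"__global__\s+void\s+(\w+)\s*\(([^)]*)\)",
                 _rewrite_signature, src)
    # Built-in index variables become OpenCL work-item functions.
    src = re.sub(
        r"\b(threadIdx|blockIdx|blockDim|gridDim)\.([xyz])\b",
        lambda m: "{}({})".format(BUILTIN_MAP[m.group(1)], AXIS[m.group(2)]),
        src,
    )
    # Shared memory and block-level barrier.
    src = re.sub(r"__shared__", "__local", src)
    src = re.sub(r"__syncthreads\s*\(\s*\)",
                 "barrier(CLK_LOCAL_MEM_FENCE)", src)
    return src


# Hypothetical sample input, chosen only for illustration.
CUDA_KERNEL = """
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}
"""

if __name__ == "__main__":
    print(toy_cuda_to_opencl(CUDA_KERNEL))
```

Even this small mapping suggests why manual rewriting scales poorly: every kernel signature, index expression, and synchronization call has to change, and the host code that allocates buffers and launches kernels (omitted here) typically differs even more between the two programming models.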

Tags: AMD FirePro S9150, ATI, Computer science, CUDA, FPGA, Intel Xeon Phi, nVidia, OpenCL, Package, performance portability, Tesla K80

March 10, 2019 by hgpu
