high performance computing on graphics processing units: hgpu.org

CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-core Architectures

Gabriel Martinez, Wu-chun Feng, Mark Gardner

Department of Computer Science, Virginia Tech, USA

Technical Report TR-11-13, Computer Science, Virginia Tech, 2011

@article{martinez2011cu2cl,
  title={CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-core Architectures},
  author={Martinez, G. and Feng, W. and Gardner, M.},
  year={2011}
}

The use of graphics processing units (GPUs) in high-performance parallel computing continues to become more prevalent, often as part of a heterogeneous system. For years, CUDA has been the de facto programming environment for nearly all general-purpose GPU (GPGPU) applications. However, the framework is available only on NVIDIA GPUs, so applications have traditionally required reimplementation in other frameworks in order to utilize additional multi- or many-core devices. OpenCL, on the other hand, provides an open and vendor-neutral programming environment and runtime system. With implementations available for CPUs, GPUs, and other types of accelerators, OpenCL holds the promise of a "write once, run anywhere" ecosystem for heterogeneous computing. Given the many similarities between CUDA and OpenCL, manually porting a CUDA application to OpenCL is typically straightforward, albeit tedious and error-prone. In response to this issue, we created CU2CL, an automated CUDA-to-OpenCL source-to-source translator that possesses a novel design and clever reuse of the Clang compiler framework. Currently, the CU2CL translator covers the primary constructs found in the CUDA runtime API, and we have successfully translated many applications from the CUDA SDK and Rodinia benchmark suite. The performance of applications automatically translated by CU2CL is on par with that of their manually ported counterparts.
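To make the "many similarities between CUDA and OpenCL" concrete, the following is a minimal sketch of how a few CUDA device-code constructs line up with their OpenCL C equivalents. It is not CU2CL's implementation: the abstract states that CU2CL reuses the Clang compiler framework, whereas this toy uses plain text substitution. The function name `rewrite_device_tokens`, the mapping table, and the sample kernel are all hypothetical and chosen only for illustration.

```python
import re

# A small, illustrative subset of well-known CUDA -> OpenCL C equivalents.
# Assumption: only the x dimension is handled, and only device-side tokens.
CUDA_TO_OPENCL = {
    "__global__": "__kernel",
    "__shared__": "__local",
    "__syncthreads()": "barrier(CLK_LOCAL_MEM_FENCE)",
    "threadIdx.x": "get_local_id(0)",
    "blockIdx.x": "get_group_id(0)",
    "blockDim.x": "get_local_size(0)",
    "gridDim.x": "get_num_groups(0)",
}

# Match longer tokens first so no shorter token can shadow a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(tok) for tok in sorted(CUDA_TO_OPENCL, key=len, reverse=True))
)


def rewrite_device_tokens(cuda_src: str) -> str:
    """Replace each known CUDA token with its OpenCL C counterpart.

    This is plain text substitution, not a real translation: for example,
    OpenCL kernel pointer parameters also need an address-space qualifier
    such as __global, which requires type information that a text-level
    rewrite does not have.
    """
    return _PATTERN.sub(lambda m: CUDA_TO_OPENCL[m.group(0)], cuda_src)


# Hypothetical sample kernel, used only to exercise the mapping.
CUDA_SAMPLE = """__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}"""

if __name__ == "__main__":
    print(rewrite_device_tokens(CUDA_SAMPLE))
```

The output of this sketch is still not valid OpenCL C, because the `data` parameter lacks its `__global` qualifier. Gaps of this kind, along with the host-side runtime API calls that the sketch ignores entirely, suggest why manual porting is tedious and error-prone and why a compiler-based translator is a better fit than text substitution.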

Tags: ATI, ATI Radeon HD 5870, Benchmarking, Code generation, Computer science, CUDA, Heterogeneous systems, MPI, nVidia, nVidia GeForce GTX 280, OpenCL

September 24, 2011 by hgpu
