high performance computing on graphics processing units: hgpu.org

Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures

Hanwoong Jung, Youngmin Yi, Soonhoi Ha

School of EECS, Seoul National University, Seoul, Korea

Parallel Processing and Applied Mathematics, 2011

@article{jung2011automatic,
  title={Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures},
  author={Jung, H. and Yi, Y. and Ha, S.},
  year={2011}
}

General-purpose GPU (GPGPU) programming has spread rapidly since CUDA was introduced as a way to write parallel programs for NVIDIA GPUs in a high-level language. While a GPU exploits data parallelism very effectively, task-level parallelism is exploited by a multithreaded program on a multicore CPU. For such a heterogeneous platform, consisting of a multicore CPU and a GPU, we propose an automatic code synthesis framework that takes a process network model specification as input and generates multithreaded CUDA code. With the model-based specification, one can explicitly specify both function-level and loop-level parallelism in an application, and explore a wide design space in mapping function blocks and selecting the communication methods between CPU and GPU. The proposed technique is complementary to other high-level methods of CUDA programming. We have confirmed the viability of our approach with several examples.
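
To make the two kinds of parallelism in the abstract concrete, here is a minimal CPU-only sketch in C++ of a three-block process network. It is not the paper's generated code: the names `Channel`, `source_block`, `scale_block`, and `run_pipeline` are hypothetical, and the unbounded FIFO stands in for whatever CPU-GPU communication method a real framework would select. Function-level parallelism appears as one thread per function block; loop-level parallelism appears as the independent per-element loop in `scale_block`, which is the kind of block a code generator could plausibly map to a CUDA kernel.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical unbounded FIFO channel connecting two function blocks.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Blocks until a value is available, as a process-network read does.
    T receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
};

using Frame = std::vector<float>;

// Function block 1: produces `frames` frames with frame[i] = f + i.
void source_block(Channel<Frame>& out, int frames, std::size_t size) {
    for (int f = 0; f < frames; ++f) {
        Frame frame(size);
        for (std::size_t i = 0; i < size; ++i) {
            frame[i] = static_cast<float>(f) + static_cast<float>(i);
        }
        out.send(std::move(frame));
    }
}

// Function block 2: a loop-level (data-parallel) block. Every iteration of
// the inner loop is independent, so this is the assumed candidate for a GPU
// mapping with one CUDA thread per element. Here it simply runs on the CPU.
void scale_block(Channel<Frame>& in, Channel<Frame>& out, int frames, float gain) {
    for (int f = 0; f < frames; ++f) {
        Frame frame = in.receive();
        for (std::size_t i = 0; i < frame.size(); ++i) {
            frame[i] *= gain;
        }
        out.send(std::move(frame));
    }
}

// Wires the network together. The calling thread acts as function block 3,
// a sink that sums every sample it receives.
double run_pipeline(int frames, std::size_t size, float gain) {
    assert(frames >= 0);
    Channel<Frame> source_to_scale;
    Channel<Frame> scale_to_sink;

    // Function-level parallelism: each block runs in its own thread.
    std::thread source(source_block, std::ref(source_to_scale), frames, size);
    std::thread scale(scale_block, std::ref(source_to_scale),
                      std::ref(scale_to_sink), frames, gain);

    double sum = 0.0;
    for (int f = 0; f < frames; ++f) {
        Frame frame = scale_to_sink.receive();
        for (float v : frame) {
            sum += v;
        }
    }

    source.join();
    scale.join();
    return sum;
}
```

Calling `run_pipeline(2, 3, 2.0f)` pushes the frames `[0, 1, 2]` and `[1, 2, 3]` through the network and returns the sum of the scaled samples. In this sketch, moving `scale_block` to a GPU would change only that block's body and the channels on either side of it, which is the sort of mapping decision the abstract describes exploring.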

Tags: Code generation, Computer science, CUDA, Data parallelism, Heterogeneous systems, nVidia, Tesla M2050

October 30, 2011 by hgpu
