high performance computing on graphics processing units: hgpu.org

CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures

Yongpeng Zhang, Frank Mueller

North Carolina State University, Raleigh, NC, USA

International Conference on Parallel Processing (ICPP’12), 2012

@article{zhang2012cunesl,
  title={CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures},
  author={Zhang, Y. and Mueller, F.},
  year={2012}
}

Data-parallel languages feature fine-grained parallel primitives that can be supported by compilers targeting modern many-core architectures, where data parallelism must be exploited to fully utilize the hardware. Previous research has focused on converting data-parallel languages for SIMD (single instruction multiple data) architectures. However, directly applying these SIMD-oriented techniques to today’s SIMT (single instruction multiple thread) architectures does not guarantee competitive performance. We propose cuNesl, a compiler framework to translate and optimize NESL into parallel CUDA programs for SIMT architectures. By converting recursive calls into while loops, we ensure that the hierarchical execution model in GPUs can be exploited on the "flattened" code. The performance gap between our auto-generated CUDA code and hand-crafted CUDA code thus narrows while programmability is greatly increased. Our compiler outperforms handwritten parallel code running on CPUs in terms of both execution time and programmability.
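The idea of turning recursive calls into a while loop over "flattened" data can be illustrated with a small sketch. This is not cuNesl's generated code and makes no claim about how the compiler works internally; it is a sequential Python illustration, assuming a nested sequence is stored as one flat list plus a list of segment descriptors. The function name `flat_quicksort` and the `(length, done)` descriptor layout are hypothetical. The inner `for` loop over segments stands in for work that a GPU could spread across threads or thread blocks.

```python
def flat_quicksort(values):
    """Quicksort with recursion replaced by a while loop over flat segments.

    Hypothetical illustration only: `data` holds all elements in one flat
    list, and `segments` holds (length, done) descriptors for each
    sub-sequence. Each pass of the while loop corresponds to one level of
    the recursion tree.
    """
    data = list(values)
    segments = [(len(data), len(data) <= 1)]

    # One iteration per recursion level, instead of one call per sub-problem.
    while any(not done for _, done in segments):
        next_data, next_segments = [], []
        start = 0
        # On a GPU, every segment could be partitioned in parallel;
        # here the segments are simply visited in order.
        for length, done in segments:
            seg = data[start:start + length]
            start += length
            if done:
                next_data.extend(seg)
                next_segments.append((length, True))
                continue
            pivot = seg[length // 2]
            less = [x for x in seg if x < pivot]
            equal = [x for x in seg if x == pivot]
            greater = [x for x in seg if x > pivot]
            parts = (
                (less, len(less) <= 1),
                (equal, True),  # all elements equal: already sorted
                (greater, len(greater) <= 1),
            )
            for part, part_done in parts:
                if part:  # skip empty partitions
                    next_data.extend(part)
                    next_segments.append((len(part), part_done))
        data, segments = next_data, next_segments
    return data
```

Calling `flat_quicksort([3, 1, 4, 1, 5, 9, 2, 6])` returns the sorted list. The loop terminates because every unfinished segment contains its own pivot, so the `less` and `greater` partitions are strictly shorter than the segment they came from.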

Tags: Computer science, CUDA, Data parallelism, nVidia, nVidia GeForce GTX 480

July 17, 2012 by hgpu
