high performance computing on graphics processing units: hgpu.org

Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation

Ziyu Guo, Xipeng Shen

College of William and Mary, Williamsburg VA 23187, USA

International Workshop on Languages and Compilers for Parallel Computing. (LCPC 2011), 2011

@inproceedings{guo2011fine,
  title={Fine-grained treatment to synchronizations in gpu-to-cpu translation},
  author={Guo, Z. and Shen, X.},
  booktitle={Proc. of the Workshop on Languages and Compilers for Parallel Computing},
  year={2011}
}

GPU-to-CPU translation can extend the execution of Graphics Processing Unit (GPU) programs to multi-core and many-core CPUs, enabling cross-device task migration and promoting whole-system synergy. This paper describes some of our findings on the treatment of GPU synchronizations during the translation process. We show that careful dependence analysis can allow a fine-grained treatment of synchronizations and reveal redundant computation at the instruction-instance level. Based on thread-level dependence graphs, we present a method that enables such fine-grained treatment automatically. Experiments demonstrate that, compared to existing translations, the new approach can yield speedups of integer factors.
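To make the idea of treating synchronizations during translation concrete, here is a minimal C++ sketch. It is an illustration under stated assumptions, not the method from the paper: the kernel in the comment, the function names `translate_with_fission` and `translate_fused`, and the choice of a single thread block are all hypothetical. The first function shows the common coarse-grained treatment, where the thread loop is split at the barrier. The second shows what a dependence-aware treatment might permit when each thread reads only its own value and that of its lower-numbered neighbor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CUDA kernel body being translated (one block, thread index t):
//   __shared__ float s[N];
//   s[t] = in[t] * 2.0f;
//   __syncthreads();
//   out[t] = s[t] + (t > 0 ? s[t - 1] : 0.0f);

// Coarse-grained treatment: the barrier becomes a loop boundary (loop fission),
// so every "thread" finishes the first statement before any starts the second.
std::vector<float> translate_with_fission(const std::vector<float>& in) {
    const std::size_t n = in.size();
    std::vector<float> s(n), out(n);
    for (std::size_t t = 0; t < n; ++t) {
        s[t] = in[t] * 2.0f;
    }
    for (std::size_t t = 0; t < n; ++t) {
        out[t] = s[t] + (t > 0 ? s[t - 1] : 0.0f);
    }
    return out;
}

// Dependence-aware treatment (assumption: analysis shows thread t reads only
// s[t] and s[t - 1]). Running threads in ascending order already satisfies
// both dependences, so the two loops can be fused into one.
std::vector<float> translate_fused(const std::vector<float>& in) {
    const std::size_t n = in.size();
    std::vector<float> s(n), out(n);
    for (std::size_t t = 0; t < n; ++t) {
        s[t] = in[t] * 2.0f;                        // produced in this iteration
        out[t] = s[t] + (t > 0 ? s[t - 1] : 0.0f);  // s[t - 1] came from the previous one
    }
    return out;
}
```

Both functions compute the same result for this kernel. The fused form would not be valid if a thread read a value written by a higher-numbered thread, which is why the choice depends on the dependences between threads rather than on the mere presence of a barrier.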

Tags: Code generation, Compilers, Computer science, CUDA, nVidia

April 4, 2012 by hgpu
