high performance computing on graphics processing units: hgpu.org

A Survey of Techniques for Architecting and Managing GPU Register File

Sparsh Mittal

Oak Ridge National Laboratory

IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016

@article{ref77,
  title={A Survey of Techniques for Architecting and Managing GPU Register File},
  year={2016},
  author={Sparsh Mittal},
  journal={IEEE Transactions on Parallel and Distributed Systems (TPDS)},
  keywords={Review, Classification, GPGPU, GPU, Register file, Reliability, Performance, Power management, Non-volatile memory, Embedded DRAM, eDRAM}
}

To support their massively multithreaded architecture, GPUs use a very large register file (RF) whose capacity exceeds even that of the L1 and L2 caches. In stark contrast, traditional CPUs use a tiny RF and much larger caches to optimize latency. Because of these differences, and because the RF is crucial in determining GPU performance, novel and intelligent techniques are required for managing the GPU RF. In this paper, we survey techniques for designing and managing the GPU RF. We discuss techniques related to the performance, energy, and reliability aspects of the RF. To emphasize the similarities and differences between the techniques, we classify them along several parameters. The aim of this paper is to synthesize state-of-the-art developments in RF management and to stimulate further research in this area.

Tags: Energy efficiency, GPU, nVidia, Performance, Power, Register file, Reliability, Research, survey

March 22, 2016 by sparsh0mittal
