high performance computing on graphics processing units: hgpu.org

Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth, Michael Abrash, Pradeep Dubey, Stephen Junkins, Adam Lake, Jeremy Sugerman, Robert Cavin, Roger Espasa, Ed Grochowski, Toni Juan, Pat Hanrahan

Tags: 3D Graphics and Realism, Architecture, Computer science, Hardware, Larrabee

December 10, 2010 by hgpu

Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor

Henry Wong, Anne Bracy, Ethan Schuchman, Tor M. Aamodt, Jamison D. Collins, Perry H. Wang, Gautham Chinya, Ankur K. Groen, Hong Jiang, Hong Wang

Tags: Computer science, FPGA, Hardware

December 7, 2010 by hgpu

GPU architecture overview

John Owens

Tags: Computer science, Hardware, Presentation, Review

December 5, 2010 by hgpu

Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow

Wilson W. L. Fung, Ivan Sham, George Yuan, Tor M. Aamodt

Tags: Computer science, Hardware, Performance

December 4, 2010 by hgpu

The visual vulnerability spectrum: characterizing architectural vulnerability for graphics hardware

Jeremy W. Sheaffer, David P. Luebke, Kevin Skadron

Tags: Computer science, Hardware, nVidia, OpenGL

December 2, 2010 by hgpu

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

LOOPer: A Learned Automatic Code Optimizer For Polyhedral Compilers

OpenMC Monte Carlo Code

Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs

Polygeist: C/C++ frontend for MLIR

Retargeting and Respecializing GPU Workloads for Performance Portability

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

SYCL in the edge: performance and energy evaluation for heterogeneous acceleration

OpenMP5-Offload-OpenMC-Intel-PVC

Distributed OpenMP Offloading of OpenMC on Intel GPU MAX Accelerators

* * *

GPGPU Performance Estimation with Core and Memory Frequency Scaling

Hardware thread reordering to boost OpenCL throughput on FPGAs

A Survey of Recent Prefetching Techniques for Processor Caches

A Survey Of Techniques for Managing and Leveraging Caches in GPUs

State of The Art Report on GPU

GPU-based parallelization for fast circuit optimization

Future graphics architectures

Larrabee: a many-core x86 architecture for visual computing

Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor

GPU architecture overview

Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow

The visual vulnerability spectrum: characterizing architectural vulnerability for graphics hardware

Recent source codes

QArray

Celerity: High-level C++ for Accelerator Clusters

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 seconds

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

OpenMP5-Offload-OpenMC-Intel-PVC

Most viewed papers (last 30 days)