high performance computing on graphics processing units: hgpu.org

hgpu.org » Operating systems

Creating HW/SW co-designed MPSoPC’s from high level programming models

Eugene Cartwright, Sen Ma, David Andrews, Miaoqing Huang

Tags: Computer science, FPGA, Heterogeneous systems, OpenCL, Operating systems, Pthreads

November 12, 2011 by hgpu

Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark

S.J. Pennycook, S.D. Hammond, S.A. Jarvis, G.R. Mudalige

Tags: Benchmarking, Computer science, CUDA, MPI, nVidia, nVidia GeForce 8400 GS, nVidia GeForce 9400 GT, Operating systems, Performance, Tesla C1060, Tesla C2050, Tesla T10

November 8, 2011 by hgpu

A shared file system abstraction for heterogeneous architectures

Mark Silberstein, Idit Keidar

Tags: Computer science, Heterogeneous systems, Operating systems

November 2, 2011 by hgpu

The MOSIX Virtual OpenCL (VCL) Cluster Platform

Amnon Barak, Amnon Shiloh

Tags: APU, Computer science, GPU cluster, Heterogeneous systems, MPI, nVidia, nVidia GeForce GTX 480, OpenCL, Operating systems, Package

October 24, 2011 by hgpu

Efficient Synchronization Primitives for GPUs

Jeff A. Stuart, John D. Owens

Tags: Algorithms, Benchmarking, Computer science, CUDA, Data Structures and Algorithms, nVidia, nVidia GeForce GTX 295, nVidia GeForce GTX 580, Operating systems

October 21, 2011 by hgpu

Operating Systems Challenges for GPU Resource Management

Shinpei Kato, Scott Brandt, Yutaka Ishikawa, Ragunathan (Raj) Rajkumar

Tags: Computer science, CUDA, GPU cluster, nVidia, OpenCL, Operating systems, Virtualization

October 15, 2011 by hgpu

PTask: Operating System Abstractions To Manage GPUs as Compute Devices

Christopher J. Rossbach, Jon Currey, Mark Silberstein, Baishakhi Ray, Emmett Witchel

Tags: Computer science, CUDA, HLSL, nVidia, nVidia GeForce GT 230, nVidia GeForce GTX 470, nVidia GeForce GTX 580, OpenCL, Operating systems, Performance, Programming techniques

October 2, 2011 by hgpu

Real-Time Handling of GPU Interrupts in LITMUS^RT

Glenn A. Elliott, Chih-Hao Sun, James H. Anderson

Tags: Algorithms, Computer science, CUDA, nVidia, nVidia GeForce GTX 470, Operating systems

September 30, 2011 by hgpu

Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework

Vignesh T. Ravi, Michela Becchi, Gagan Agrawal, Srimat Chakradhar

Tags: Algorithms, Cloud, Computer science, CUDA, nVidia, Operating systems, Performance, Tesla C2050, Virtualization

September 20, 2011 by hgpu

Performing with CUDA

William B. Langdon

Tags: Computer science, CUDA, nVidia, Operating systems, Performance, Review, Software Engineering, Tutorial

September 15, 2011 by hgpu

The case for VOS: the vector operating system

Vijay Vasudevan, David G. Andersen, Michael Kaminsky

Tags: Computer science, Operating systems

September 14, 2011 by hgpu

Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform

Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Hu Chen, Gansha Wu, Sai Luo, Bratin Saha

Tags: Computer science, Heterogeneous systems, Memory, Operating systems, Performance, Programming Languages

September 12, 2011 by hgpu
