hgpu.org » Streaming
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Janghaeng Lee, and Scott Mahlke
Tags: Compiler, CUDA, GPU, nVidia, nVidia GeForce GTX 285, Optimization, Portability, Streaming, Tesla C2050
March 31, 2012 by Moaddeli