hgpu.org » Streaming
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Janghaeng Lee, and Scott Mahlke
Tags: Compiler, CUDA, GPU, nVidia, nVidia GeForce GTX 285, Optimization, Portability, Streaming, Tesla C2050
March 31, 2012 by Moaddeli
Recent source codes
* * *
Most viewed papers (last 30 days)
- A Microbenchmark Framework for Performance Evaluation of OpenMP Target Offloading
- pyATF: Constraint-Based Auto-Tuning in Python
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
- WgPy: GPU-accelerated NumPy-like array library for web browsers
- CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
- LLMPerf: GPU Performance Modeling meets Large Language Models
- The Shamrock code: I- Smoothed Particle Hydrodynamics on GPUs
- Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems
- TransCL: An Automatic CUDA-to-OpenCL Programs Transformation Framework
- Can Tensor Cores Benefit Memory-Bound Kernels? (No!)
* * *