hgpu.org » nVidia Ampere A2
Ruben Laso, Diego Krupitza, Sascha Hunold
Tags: Benchmarking, Computer science, CUDA, nVidia, nVidia Ampere A2, OpenMP, Package, Performance, Tesla T4
September 1, 2024 by hgpu
Ruben Laso, Diego Krupitza, Sascha Hunold
Tags: Benchmarking, Computer science, CUDA, nVidia, nVidia Ampere A2, OpenMP, performance portability, Tesla P4
February 18, 2024 by hgpu