hgpu.org » Intel Gaudi-2
Yunjae Lee, Juntaek Lim, Jehyeon Bang, Eunyeong Cho, Huijong Jeong, Taesu Kim, Hyungjun Kim, Joonhyung Lee, Jinseop Im, Ranggi Hwang, Se Jung Kwon, Dongsoo Lee, Minsoo Rhu
Tags: AI, Benchmarking, Computer science, CUDA, Intel, Intel Gaudi-2, nVidia, nVidia A100, Performance
January 6, 2025 by hgpu



