Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin
Tags: AI, Apple M2 Max, Apple M2 Pro, Apple M2 Ultra, Computer science, CUDA, Linear Algebra, LLM, Machine learning, nVidia, nVidia GeForce RTX 4090, nVidia GeForce RTX 2080 Ti, nVidia Quadro RTX 4000, nVidia RTX A6000, Performance, PyTorch
February 3, 2025 by hgpu