hgpu.org » Apple M2 Pro
Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin
Tags: AI, Apple M2 Max, Apple M2 Pro, Apple M2 Ultra, Computer science, CUDA, Linear Algebra, LLM, Machine learning, nVidia, nVidia GeForce RTX 4090, nVidia GeForce RTX 2080 Ti, nVidia Quadro RTX 4000, nVidia RTX A6000, Performance, PyTorch
February 3, 2025 by hgpu