Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin
Tags: AI, Apple M2 Max, Apple M2 Pro, Apple M2 Ultra, Computer science, CUDA, Linear Algebra, LLM, Machine learning, nVidia, nVidia GeForce RTX 4090, nVidia GeForce RTX 2080 Ti, nVidia Quadro RTX 4000, nVidia RTX A6000, Performance, PyTorch
February 3, 2025 by hgpu