Exa.TrkX
Xiangyang Ju, Daniel Murnane, Paolo Calafiura, Nicholas Choma, Sean Conlon, Steve Farrell, Yaoyuan Xu, Maria Spiropulu, Jean-Roch Vlimant, Adam Aurisano, Jeremy Hewes, Giuseppe Cerati, Lindsey Gray, Thomas Klijnsma, Jim Kowalkowski, Markus Atkinson, Mark Neubauer, Gage DeZoort, Savannah Thais, Aditi Chauhan, Alex Schuy, Shih-Chieh Hsu, Alex Ballow, Alina Lazar
Tags: Algorithms, CUDA, Deep learning, Exa.TrkX, HEP, Neural networks, nVidia, Package, Physics, Tesla A100, Tesla V100
March 21, 2021 by hgpu




