hgpu.org » Exa.TrkX
Xiangyang Ju, Daniel Murnane, Paolo Calafiura, Nicholas Choma, Sean Conlon, Steve Farrell, Yaoyuan Xu, Maria Spiropulu, Jean-Roch Vlimant, Adam Aurisano, Jeremy Hewes, Giuseppe Cerati, Lindsey Gray, Thomas Klijnsma, Jim Kowalkowski, Markus Atkinson, Mark Neubauer, Gage DeZoort, Savannah Thais, Aditi Chauhan, Alex Schuy, Shih-Chieh Hsu, Alex Ballow, Alina Lazar
Tags: Algorithms, CUDA, Deep learning, Exa.TrkX, HEP, Neural networks, nVidia, Package, Physics, Tesla A100, Tesla V100
March 21, 2021 by hgpu
Recent source codes
* * *
Most viewed papers (last 30 days)
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
- Scalable GPU-Based Integrity Verification for Large Machine Learning Models
- STARK: Strategic Team of Agents for Refining Kernels
- Tutoring LLM into a Better CUDA Optimizer
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
- Collective Communication for 100k+ GPUs
- An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
- Enhancing Transformer Performance and Portability through Auto-tuning Frameworks
- RDMA Point-to-Point Communication for LLM Systems
- A Study of Floating-Point Precision Tuning in Deep Learning Operators Implementations
* * *




