Borui Wan, Gaohong Liu, Zuquan Song, Jun Wang, Yun Zhang, Guangming Sheng, Shuguang Wang, Houmin Wei, Chenyuan Wang, Weiqiang Lou, Xi Yang, Mofan Zhang, Kaihua Jiang, Cheng Ren, Xiaoyun Zhi, Menghan Yu, Zhe Nan, Zhuolin Zheng, Baoquan Zhong, Qinlong Wang, Huan Yu, Jinxin Chi, Wang Zhang, Yuhan Li, Zixian Du, Sida Zhao, Yongqiang Zhang, Jingzhe Tang, Zherui Liu, Chuan Wu, Yanghua Peng, Haibin Lin, Wencong Xiao, Xin Liu, Liang Xiang
Tags: AI, Computer science, CUDA, LLM, nVidia, nVidia L20
September 28, 2025 by hgpu
* * *
Most viewed papers (last 30 days)
- Compiler and Runtime Systems for Generative AI Models
- Scalable GPU-Based Integrity Verification for Large Machine Learning Models
- STARK: Strategic Team of Agents for Refining Kernels
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
- Tutoring LLM into a Better CUDA Optimizer
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
- Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs
- Adaptivity in AdaptiveCpp: Optimizing Performance by Leveraging Runtime Information During JIT-Compilation
- Collective Communication for 100k+ GPUs
- Enhancing Transformer Performance and Portability through Auto-tuning Frameworks
* * *



