hgpu.org » nVidia L20
Kaixuan Zhang, Yunfan Cui, Shuhao Zhang, Chutong Ding, Shiyou Qian, Luping Wang, Jian Cao, Guangtao Xue, Cheng Huang, Guodong Yang, Liping Zhang
Tags: Computer science, CUDA, Heterogeneous systems, Machine learning, nVidia, nVidia A100, nVidia A40, nVidia H100, nVidia H20, nVidia H200, nVidia H800, nVidia L20, nVidia L40, nVidia RTX 6000 Ada, Performance, Triton
January 25, 2026 by hgpu
Borui Wan, Gaohong Liu, Zuquan Song, Jun Wang, Yun Zhang, Guangming Sheng, Shuguang Wang, Houmin Wei, Chenyuan Wang, Weiqiang Lou, Xi Yang, Mofan Zhang, Kaihua Jiang, Cheng Ren, Xiaoyun Zhi, Menghan Yu, Zhe Nan, Zhuolin Zheng, Baoquan Zhong, Qinlong Wang, Huan Yu, Jinxin Chi, Wang Zhang, Yuhan Li, Zixian Du, Sida Zhao, Yongqiang Zhang, Jingzhe Tang, Zherui Liu, Chuan Wu, Yanghua Peng, Haibin Lin, Wencong Xiao, Xin Liu, Liang Xiang
Tags: AI, Computer science, CUDA, LLM, nVidia, nVidia L20
September 28, 2025 by hgpu



