Joshua H. Davis, Klaudiusz Rydzy, Srinivasan Ramesh, Aadit Nilay, Daniel Nichols, Swapna Raj, Nikhil Jain, Abhinav Bhatele
Xing Ma, Yangjie Zhou, Wu Sun, Zihan Liu, Jingwen Leng, Yun Lin, Shixuan Sun, Minyi Guo, Jin Song Dong
Aaron Jarmusch, Sunita Chandrasekaran
Size Zheng, Xuegui Zheng, Hanshi Sun, Qi Hou, Wenlei Bao, Shiyu Li, Haojie Duanmu, Jin Fang, Chenli Xue, Chenhui Huang, Yuanqiang Liu, Renze Chen, Ningxin Zheng, Dongyang Wang, Li-Wen Chang, Liqiang Lu, Yun Liang, Jidong Zhai, Xin Liu
Haohui Mai, Xiaoyan Guo, Xiangyun Ding, Daifeng Li, Qiuchu Yu, Chenzhun Guo, Cong Wang, Jiacheng Zhao, Christos Kozyrakis, Binhang Yuan
Sina Heidari, Dimitrios S. Nikolopoulos
Yuang Yan, Ian Karlin, Ryan Grant
Xulin Zhou, Hongbin Zhang, Mingjie Xing
Divakar Kumar Yadav, Tian Zhao, Deepak Kumar
Tags: Computer science, CUBLAS, CUDA, LLM, nVidia, nVidia B200, nVidia H100, nVidia RTX PRO 6000, Package, Performance, Triton
Jingzhi Fang, Xiong Gao, Renwei Zhang, Zichun Ye, Lei Chen, Jie Zhao, Chengnuo Huang, Hui Xu, Xuefeng Jin
Benjamin Mikek, Danylo Vashchilenko, Bryan Lu, Panpan Xu