Haohui Mai, Xiaoyan Guo, Xiangyun Ding, Daifeng Li, Qiuchu Yu, Chenzhun Guo, Cong Wang, Jiacheng Zhao, Christos Kozyrakis, Binhang Yuan
Sina Heidari, Dimitrios S. Nikolopoulos
Yuang Yan, Ian Karlin, Ryan Grant
Xulin Zhou, Hongbin Zhang, Mingjie Xing
Divakar Kumar Yadav, Tian Zhao, Deepak Kumar
Tags: Computer science, CUBLAS, CUDA, LLM, nVidia, nVidia B200, nVidia H100, nVidia RTX PRO 6000, Package, Performance, Triton
Jingzhi Fang, Xiong Gao, Renwei Zhang, Zichun Ye, Lei Chen, Jie Zhao, Chengnuo Huang, Hui Xu, Xuefeng Jin
Benjamin Mikek, Danylo Vashchilenko, Bryan Lu, Panpan Xu
Tara Saba, Anne Ouyang, Xujie Si, Fan Long
He Du, Qiming Ge, Jiakai Hu, Aijun Yang, Zheng Cai, Zixian Huang, Sheng Yuan, Qinxiu Cheng, Xinchen Xie, Yicheng Chen, Yining Li, Jiaxing Xie, Huanan Dong, Yaguang Wu, Xiangjun Huang, Jian Yang, Hui Wang, Bowen Zhou, Bowen Li, Qipeng Guo, Kai Chen
Zhengqing Yuan, Hanchi Sun, Lichao Sun, Yanfang Ye
Siqi Guo, Ming Lin, Tianbao Yang
Ilias K. Kasmeridis, Vassilios V. Dimakopoulos