Patricia Siwinska, Jie Lei, Adrian Castello, Pedro Alonso-Jordá, Enrique S. Quintana-Orti
Zhongzhen Wen, Hongyu Liu, Tingwei Zhu, Minxue Pan, Shaohua Wang, Yuanyi Lin, Kairui Liu, Tian Zhang, Xuandong Li
Leonardo Solis-Vasquez, Andreas F. Tillack, Diogo Santos-Martins, Andreas Koch, Stefano Forli
Yifan Zhao, Egan Johnson, Prasanth Chatarasi, Vikram Adve, Sasa Misailovic
Tags: AMD Radeon Instinct MI300X, ATI, Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia RTX A5000, nVidia RTX A6000, Package, Performance, ROCm
Lingcheng Kong, Jiateng Wei, Hanzhang Shen, Huan Wang
Ping Guo, Chenyu Zhu, Siyuan Chen, Fei Liu, Xi Lin, Zhichao Lu, Qingfu Zhang
Yongin Kwon, Joohyoung Cha, Sehyeon Oh, Misun Yu, Jeman Park, Jemin Lee
Jianghui Wang, Vinay Joshi, Saptarshi Majumder, Xu Chao, Bin Ding, Ziqiong Liu, Pratik Prabhanjan Brahma, Dong Li, Zicheng Liu, Emad Barsoum
Mohammad Firas Sada, John J. Graham, Elham E Khoda, Mahidhar Tatineni, Dmitry Mishin, Rajesh K. Gupta, Rick Wagner, Larry Smarr, Thomas A. DeFanti, Frank Würthwein
Jinliang Shi, Shigang Li, Youxuan Xu, Xueying Wang, Rongtian Fu, Zhi Ma, Tong Wu
Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Yang Hu, Shouyi Yin
Tags: AI, Benchmarking, Compilers, Computer science, CUDA, Deep learning, LLM, nVidia, nVidia A100, Package, performance portability