Het Mankad, Sanil Rao, Brian Van Straalen, Phillip Colella, Franz Franchetti
Mingyu Liang, Wenyin Fu, Louis Feng, Zhongyi Lin, Pavani Panakanti, Shengbao Zheng, Srinivas Sridharan, Christina Delimitrou
Tags: AI, Benchmarking, Code generation, Computer science, CUDA, nVidia, Package, Performance, PyTorch, Tesla A100, Tesla V100
Daniel Nichols, Aniruddha Marathe, Harshitha Menon, Todd Gamblin, Abhinav Bhatele
William F. Godoy, Pedro Valero-Lara, Keita Teranishi, Prasanna Balaprakash, Jeffrey S. Vetter
Tags: AI, Artificial intelligence, Benchmarking, Code generation, Computer science, CUDA, Fortran, HPC, Julia, nVidia, OpenACC, OpenMP, Package, Python
Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña
Pratik Fegade, Tianqi Chen, Phillip B. Gibbons, Todd C. Mowry
Juan Fumero, György Rethy, Athanasios Stratikopoulos, Nikos Foutris, Christos Kotselidis
Shixun Wu, Yujia Zhai, Jinyang Liu, Jiajun Huang, Zizhe Jian, Bryan M. Wong, Zizhong Chen
Tags: Code generation, Computer science, CUDA, GEMM, Linear Algebra, Matrix multiplication, nVidia, nVidia A100, Package, Performance, Reliability, Tesla T4
Vsevolod Livinskii, Dmitry Babokin, John Regehr
Giorgis Georgakoudis, Konstantinos Parasyris, Chunhua Liao, David Beckingsale, Todd Gamblin, Bronis de Supinski
Tags: AMD Radeon Instinct Mi50, ATI, Benchmarking, Code generation, Compilers, Computer science, CUDA, Heterogeneous systems, Machine learning, nVidia, OpenMP, performance portability, Tesla P100, Tesla V100
Tal Ben-Nun, Berke Ates, Alexandru Calotoiu, Torsten Hoefler