Ronald M. Caplan, Miko M. Stulajter, Jon A. Linker, Jeff Larkin, Henry A. Gabb, Shiquan Su, Ivan Rodriguez, Zachary Tschirhart, Nicholas Malaya
Tags: Computer science, Fortran, Intel, Intel Data Center GPU Max 1550, Intel Ponte Vecchio Max 1100, nVidia, nVidia A100, nVidia GH200, nVidia H100, OpenACC, OpenMP, Package
Mert Hidayetoglu, Simon Garcia de Gonzalo, Elliott Slaughter, Pinku Surana, Wen-mei Hwu, William Gropp, Alex Aiken
Sungho Lee, Marco Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Giorgio Fabbro, Kyogu Lee, Yuki Mitsufuji
Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia GeForce GTX 1060, nVidia GeForce GTX 980, nVidia GeForce RTX 2060, OpenCL, Package, Performance, TensorFlow
Yi Ju, Mingshuai Li, Adalberto Perez, Laura Bellentani, Niclas Jansson, Stefano Markidis, Philipp Schlatter, Erwin Laure
Junqing Lin, Jingwei Sun, Xiaolong Shi, Honghe Zhang, Xianzhi Yu, Xinzhi Wang, Jun Yao, Guangzhong Sun
Tags: Compilers, Computer science, CUDA, Deep learning, Linear Algebra, Matrix multiplication, Neural networks, nVidia, nVidia GeForce RTX 2080 Ti, Performance, Sparse matrix, Tesla V100
Gabriele Mencagli, Patrizio Dazzi, Massimo Coppola
Seonho Lee, Amar Phanishayee, Divya Mahajan
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia P100, nVidia V100, Performance, PyTorch, Tesla T4
Yizhou Luo, Qiang Wang, Shaohuai Shi, Jiaxin Lai, Shuhan Qi, Jiajia Zhang, Xuan Wang