Mert Hidayetoglu, Simon Garcia de Gonzalo, Elliott Slaughter, Pinku Surana, Wen-mei Hwu, William Gropp, Alex Aiken
Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia GeForce GTX 1060, nVidia GeForce GTX 980, nVidia GeForce RTX 2060, OpenCL, Package, Performance, TensorFlow
Yi Ju, Mingshuai Li, Adalberto Perez, Laura Bellentani, Niclas Jansson, Stefano Markidis, Philipp Schlatter, Erwin Laure
Junqing Lin, Jingwei Sun, Xiaolong Shi, Honghe Zhang, Xianzhi Yu, Xinzhi Wang, Jun Yao, Guangzhong Sun
Tags: Compilers, Computer science, CUDA, Deep learning, Linear Algebra, Matrix multiplication, Neural networks, nVidia, nVidia GeForce RTX 2080 Ti, Performance, Sparse matrix, Tesla V100
Gabriele Mencagli, Patrizio Dazzi, Massimo Coppola
Seonho Lee, Amar Phanishayee, Divya Mahajan
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia P100, nVidia V100, Performance, PyTorch, Tesla T4
Yizhou Luo, Qiang Wang, Shaohuai Shi, Jiaxin Lai, Shuhan Qi, Jiajia Zhang, Xuan Wang
Milo Lurati, Stijn Heldens, Alessio Sclocco, Ben van Werkhoven
Tags: AMD Radeon Instinct MI250X, AMD Radeon Pro W6600, ATI, Computer science, CUDA, HIP, nVidia, nVidia A100, nVidia RTX A4000, Package, Performance, Python
Pinxue Zhao, Hailin Zhang, Fangcheng Fu, Xiaonan Nie, Qibin Liu, Fang Yang, Yuanbo Peng, Dian Jiao, Shuaipeng Li, Jinbao Xue, Yangyu Tao, Bin Cui
Weihang Gao, Teng Zhao, Yongfa Guo, Jiuyang Liang, Huan Liu, Maoying Luo, Zedong Luo, Wei Qin, Yichao Wang, Qi Zhou, Shi Jin, Zhenli Xu