Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren, Luigi Fusco, Matteo Turisini, Daniele Cesarini, Kurt Lust, Animesh Trivedi, Duncan Roweth, Filippo Spiga, Salvatore Di Girolamo, Torsten Hoefler
Tags: AMD Radeon Instinct MI250X, ATI, Benchmarking, Computer science, CUDA, HPC, MPI, nVidia, nVidia A100, nVidia H100, Performance
September 1, 2024 by hgpu
Jiří Klepl, Adam Šmelko, Lukáš Rozsypal, Martin Kruliš
Brandon Alexander Burtchell, Martin Burtscher
Lukas Armborst, Pieter Bos, Lars B. van den Haak, Marieke Huisman, Robert Rubbens, Ömer Şakar, Philip Tasche
Mert Hidayetoglu, Simon Garcia de Gonzalo, Elliott Slaughter, Pinku Surana, Wen-mei Hwu, William Gropp, Alex Aiken
Qipeng Wang, Shiqi Jiang, Zhenpeng Chen, Xu Cao, Yuanchun Li, Aoyu Li, Yun Ma, Ting Cao, Xuanzhe Liu
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia GeForce GTX 1060, nVidia GeForce GTX 980, nVidia GeForce RTX 2060, OpenCL, Package, Performance, TensorFlow
Yi Ju, Mingshuai Li, Adalberto Perez, Laura Bellentani, Niclas Jansson, Stefano Markidis, Philipp Schlatter, Erwin Laure
Junqing Lin, Jingwei Sun, Xiaolong Shi, Honghe Zhang, Xianzhi Yu, Xinzhi Wang, Jun Yao, Guangzhong Sun
Tags: Compilers, Computer science, CUDA, Deep learning, Linear Algebra, Matrix multiplication, Neural networks, nVidia, nVidia GeForce RTX 2080 Ti, Performance, Sparse matrix, Tesla V100
Gabriele Mencagli, Patrizio Dazzi, Massimo Coppola
Seonho Lee, Amar Phanishayee, Divya Mahajan
Tags: Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia P100, nVidia V100, Performance, PyTorch, Tesla T4