Rafael Gadea-Gironés, José Luís Rocabado-Rocha, Jorge Fe, Jose M. Monzo
Maoxue Yu, Guanghao Ma, Zhuoya Wang, Shuai Tang, Yuhu Chen, Yucheng Wang, Yuanyuan Liu, Dongning Jia, Zhiqiang Wei
Jiacheng Yang, Christina Giannoula, Jun Wu, Mostafa Elhoushi, James Gleeson, Gennady Pekhimenko
Tags: Cloud, Computer science, CUDA, Matrix multiplication, nVidia, nVidia GeForce RTX 2070, nVidia GeForce RTX 2080 Ti, nVidia GeForce RTX 3090, Package, Performance, PyTorch, Tesla A100
Tsung-Wei Huang, Boyang Zhang, Dian-Lun Lin, Cheng-Hsiang Chiu
Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, Qing Liu, Ana Gainaru, Norbert Podhorszki, Richard Archibald, Sanjay Ranka, Scott Klasky
Tags: Compression, Computer science, CUDA, HIP, HPC, Numerical Analysis, nVidia, nVidia V100, OpenMP, Package, SYCL
Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin
Tags: Computer science, CUDA, Deep learning, Distributed computing, GPU cluster, nVidia, nVidia A100, nVidia P100, nVidia V100, Package, PyTorch
John Jacobson, Martin Burtscher, Ganesh Gopalakrishnan
Wei-Chen Lin, Simon McIntosh-Smith, Tom Deakin
Foteini Strati, Xianzhe Ma, Ana Klimovic
Ashwina Kumar, M. Venkata Krishna, Prasanna Bartakke, Rahul Kumar, Rajesh Pandian M, Nibedita Behera, Rupesh Nasre
Tags: Code generation, Computer science, CUDA, DSL, nVidia, nVidia GeForce RTX 2080 Ti, OpenACC, OpenCL, Package, SYCL, Tesla V100