Hiroyuki Ootomo, Katsuhisa Ozaki, Rio Yokota
Tags: Computer science, CUBLAS, CUDA, Deep learning, Linear Algebra, Machine learning, Matrix multiplication, nVidia, nVidia A100, nVidia Jetson AGX Orin, nVidia RTX 6000 Ada, nVidia Titan RTX, Package
Shixun Wu, Yujia Zhai, Jinyang Liu, Jiajun Huang, Zizhe Jian, Bryan M. Wong, Zizhong Chen
Tags: Code generation, Computer science, CUDA, GEMM, Linear Algebra, Matrix multiplication, nVidia, nVidia A100, Package, Performance, Reliability, Tesla T4
Noel Chalmers, Jakub Kurzak, Damon McDougall, Paul T. Bauman
Zhiyi Li, Douglas Orr, Valeriu Ohan, Godfrey Da costa, Tom Murray, Adam Sanders, Deniz Beker, Dominic Masters
Jonathan Wapman, Sean Treichler, Serban D. Porumbescu, John D. Owens
Mina Ashoury, Mohammad Loni, Farshad Khunjush, Masoud Daneshtalab
February 26, 2023 by hgpu
Anna Fortenberry, Stanimire Tomov
December 25, 2022 by hgpu
Muhammad Osama
Tags: Algorithms, Computer science, CUDA, Linear Algebra, load balancing, Matrix multiplication, nVidia, nVidia A100, Package, Sparse, Thesis
December 25, 2022 by hgpu
Yu-Hsiang M. Tsai, Terry Cojean, Hartwig Anzt
Tags: AMD Radeon Instinct MI100, ATI, Computer science, CUDA, Linear Algebra, nVidia, nVidia A100, OpenCL, Package, performance portability, Sparse, SYCL
Jieyang Chen, Chenhao Xie, Jesun S Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin Barker, Mark Raugas, Ang Li