Ivan R. Ivanov, Oleksandr Zinenko, Jens Domke, Toshio Endo, William S. Moses
Tags: AMD Radeon Instinct MI210, AMD Radeon RX 6800, ATI, Computer science, CUDA, HIP, HPC, nVidia, nVidia A100, nVidia RTX A4000, Package, performance portability
John Tramm, Paul Romano, Patrick Shriwise, Amanda Lund, Johannes Doerfert, Patrick Steinbrecher, Andrew Siegel, Gavin Ridley
Tags: AMD Radeon Instinct MI250X, ATI, Computer science, CUDA, Intel, Intel Data Center GPU Max 1550, Intel Ponte Vecchio Max 1100, nVidia, nVidia A100, OpenMP, Package, performance portability
Adrian Perez Dieguez, Min Choi, Mahmut Okyay, Mauro Del Ben, Bryan M. Wong, Khaled Z. Ibrahim
Andres E. Tomas, Enrique S. Quintana-Orti, Hartwig Anzt
Xinyi Li, Ang Li, Bo Fang, Katarzyna Swirydowicz, Ignacio Laguna, Ganesh Gopalakrishnan
Tags: AMD Radeon Instinct MI100, AMD Radeon Instinct MI250X, ATI, Computer science, Hardware Architecture, HPC, Matrix multiplication, nVidia, nVidia A100, nVidia H100, nVidia V100, PTX
Ali Asadi, Amintor Dusko, Chae-Yeun Park, Vincent Michaud-Rioux, Isidor Schoch, Shuli Shu, Trevor Vincent, Lee James O'Riordan
Dan Zhao, Siddharth Samsi, Joseph McDonald, Baolin Li, David Bestor, Michael Jones, Devesh Tiwari, Vijay Gadepally
Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Qiang Wang, Xiaowen Chu
Tags: Artificial intelligence, Benchmarking, Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia GeForce RTX 4090, nVidia H800, Performance, PTX
February 25, 2024 by hgpu
Taesu Kim, Jongho Lee, Daehyun Ahn, Sarang Kim, Jiwoong Choi, Minkyu Kim, Hyungjun Kim
Tags: Computer science, CUDA, Deep learning, Machine learning, Matrix multiplication, Mixed precision, nVidia, nVidia A100, nVidia GeForce RTX 4090, nVidia RTX A6000, Package
February 18, 2024 by hgpu
Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y.K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang
February 12, 2024 by hgpu
Gianmarco Accordi, Davide Gadioli, Emanuele Vitali, Luigi Crisci, Biagio Cosenza, Andrea Beccari, Gianluca Palermo
February 12, 2024 by hgpu
Robert Jendersie, Christian Lessig, Thomas Richter
Tags: Computer science, CUDA, Earth and Space Sciences, Finite element method, Numerical Analysis, nVidia, nVidia A100, nVidia GeForce RTX 3090, OpenMP, Package, PyTorch, SYCL