Taesu Kim, Jongho Lee, Daehyun Ahn, Sarang Kim, Jiwoong Choi, Minkyu Kim, Hyungjun Kim
Tags: Computer science, CUDA, Deep learning, Machine learning, Matrix multiplication, Mixed precision, nVidia, nVidia A100, nVidia GeForce RTX 4090, nVidia RTX A6000, Package
February 18, 2024 by hgpu
Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y.K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang
February 12, 2024 by hgpu
Hunter McCoy, Prashant Pandey
Tyler Sorensen, Heidy Khlaaf
Robert Jendersie, Christian Lessig, Thomas Richter
Tags: Computer science, CUDA, Earth and Space Sciences, Finite element method, Numerical Analysis, nVidia, nVidia A100, nVidia GeForce RTX 3090, OpenMP, Package, PyTorch, SYCL
Andrea Montessori, Michele La Rocca, Giorgio Amati, Marco Lauricella, Adriano Tiribocchi, Sauro Succi
Jolly Chen, Monica Dessole, Ana Lucia Varbanescu
Jiacheng Yang, Christina Giannoula, Jun Wu, Mostafa Elhoushi, James Gleeson, Gennady Pekhimenko
Tags: Cloud, Computer science, CUDA, Matrix multiplication, nVidia, nVidia GeForce RTX 2070, nVidia GeForce RTX 2080 Ti, nVidia GeForce RTX 3090, Package, Performance, PyTorch, Tesla A100
Tsung-Wei Huang, Boyang Zhang, Dian-Lun Lin, Cheng-Hsiang Chiu
Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, Qing Liu, Ana Gainaru, Norbert Podhorszki, Richard Archibald, Sanjay Ranka, Scott Klasky
Tags: Compression, Computer science, CUDA, HIP, HPC, Numerical Analysis, nVidia, nVidia V100, OpenMP, Package, SYCL
Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin
Tags: Computer science, CUDA, Deep learning, Distributed computing, GPU cluster, nVidia, nVidia A100, nVidia P100, nVidia V100, Package, PyTorch