Navdeep Katel, Vivek Khandelwal, Uday Bondhugula
September 5, 2021 by hgpu
Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Sarikivi
Ahmad Abdelfattah, Mohammed Al Farhan, Cade Brown, Mark Gates, Dalal Sukkari, Asim YarKhan, Jack Dongarra
Xiaoyan Liu, Yi Liu, Ming Dun, Bohong Yin, Hailong Yang, Zhongzhi Luan, Depei Qian
Thomas Faingnaert, Tim Besard, Bjorn De Sutter
Tags: Computer science, CUBLAS, CUDA, Julia, Machine learning, Mathematical Software, Matrix multiplication, Mixed precision, nVidia, nVidia GeForce RTX 2080 Ti, Package, Performance
Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX
Bastian Hagedorn, Archibald Samuel Elliott, Henrik Barthels, Rastislav Bodik, Vinod Grover
Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein
Jianyu Huang, Chenhan D. Yu, Robert A. van de Geijn
September 2, 2018 by hgpu
Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda
Tags: Benchmarking, Caffe, Computer science, CUBLAS, CUDA, Deep learning, Intel Xeon Phi, Machine learning, nVidia, Tesla K40, Tesla K80, Tesla P100
December 24, 2017 by hgpu