Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX
Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu
Yehia Arafa, Ammar ElWazir, Abdelrahman ElKanishy, Youssef Aly, Ayatelrahman Elsayed, Abdel-Hameed Badawy, Gopinath Chennupati, Stephan Eidenbenz, Nandakishore Santhi
February 23, 2020 by hgpu
Yehia Arafa, Abdel-Hameed Badawy, Gopinath Chennupati, Nandakishore Santhi, Stephan Eidenbenz
Tags: Benchmarking, Computer science, CUDA, nVidia, nVidia GeForce GTX Titan X, nVidia Titan RTX, Performance, PTX, Tesla K40, Tesla P100, Tesla V100
Benjamin Ferrell, Jun Duan, Kevin W. Hamlen
Zhe Jia, Marco Maggioni, Jeffrey Smith, Daniele Paolo Scarpazza
Ricardo Nobre, Luis Reis, Joao M. P. Cardoso
Zhe Jia, Marco Maggioni, Benjamin Staiger, Daniele P. Scarpazza
Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Patricia Suriana, Shoaib Kamil, Saman Amarasinghe
Tags: Compilers, Computer science, DSL, FPGA, Matrix multiplication, nVidia, OpenMPI, Performance, Programming Languages, PTX, Tesla K40
Philippe Tillet, David Cox
Tags: Auto-Tuning, Computer science, CUDA, Deep learning, nVidia, nVidia GeForce GTX 980 Ti, OpenCL, Package, Performance, PTX, Tesla P100
February 17, 2018 by hgpu
Scott Cyphers, Arjun K. Bansal, Anahita Bhiwandiwalla, Jayaram Bobba, Matthew Brookhart, Avijit Chakraborty, Will Constable, Christian Convey, Leona Cook, Omar Kanawi, Robert Kimball, Jason Knight, Nikolay Korovaiko, Varun Kumar, Yixing Lao, Christopher R. Lishka, Jaikrishnan Menon, Jennifer Myers, Sandeep Aswath Narayana, Adam Procter, Tristan J. Webb