Hugh Leather, Chris Cummins
Deepak Narayanan, Keshav Santhanam, Fiodar Kazhamiaka, Amar Phanishayee, Matei Zaharia
Tags: Computer science, CUDA, Deep learning, FPGA, GPU cluster, Heterogeneous systems, nVidia, Optimization, Task scheduling, Tesla K80, Tesla P100, Tesla V100
Zongmei Gao, Zhongwei Luo, Wen Zhang, Zhenzhen Lv, Yanlei Xu
Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D. Manning, Curtis P. Langlotz
Jeffrey Krupa, Kelvin Lin, Maria Acosta Flechas, Jack Dinsmore, Javier Duarte, Philip Harris, Scott Hauck, Burt Holzman, Shih-Chieh Hsu, Thomas Klijnsma, Mia Liu, Kevin Pedro, Natchanon Suaysom, Matt Trahms, Nhan Tran
Xueying Wang, Guangli Li, Xiao Dong, Jiansong Li, Lei Liu, Xiaobing Feng
Sasikanth Avancha, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty
Chao-Tung Yang, Jung-Chun Liu, Yu-Wei Chan, Endah Kristiani, Chan-Fu Kuo
Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen
Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX