Chao Chen, Chris Porter, Santosh Pande
K.I. Mihajlenko, M.A. Lukin, A.S. Stankevich
Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe
Kai Zhu, Wenyi Zhao, Zhen Zheng, Tianyou Guo, Pengzhan Zhao, Junjie Bai, Jun Yang, Xiaoyong Liu, Lansong Diao, Wei Lin
Rachit Nigam, Samuel Thomas, Zhijing Li, Adrian Sampson
Biagio Cosenza, Nikita Popov, Ben Juurlink, Paul Richmond, Mozhgan Kabiri Chimeh, Carmine Spagnuolo, Gennaro Cordasco, Vittorio Scarano
Hugh Leather, Chris Cummins
Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX
Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica