Yuhsiang M. Tsai, Terry Cojean, Tobias Ribizel, Hartwig Anzt
Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX
Szilárd Páll, Artem Zhmurov, Paul Bauer, Mark Abraham, Magnus Lundborg, Alan Gray, Berk Hess, Erik Lindahl
Tags: Algorithms, AMD Radeon Instinct Mi50, AMD Vega FE, ATI, Chemistry, CUDA, Heterogeneous systems, Molecular dynamics, MPI, nVidia, nVidia GeForce RTX 2080, nVidia Quadro P 6000, OpenCL, Package, Tesla V100
Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica
Johannes Blühdorn, Nicolas R. Gauger, Matthias Kabel
Ben van Werkhoven, Willem Jan Palenstijn, Alessio Sclocco
Nikolay Banar, Walter Daelemans, Mike Kestemont
Behnam Pourghassemi, Chenghao Zhang, Joo Hwan Lee, Aparna Chandramowlishwaran
Jay H. Park, Gyeongchan Yun, Chang M. Yi, Nguyen T. Nguyen, Seungmin Lee, Jaesik Choi, Sam H. Noh, Young-ri Choi
Tags: Computer science, CUDA, Data parallelism, Deep learning, GPU cluster, Heterogeneous systems, Neural networks, nVidia, nVidia GeForce GTX Titan V, nVidia GeForce RTX 2060, nVidia Quadro P 4000, nVidia Titan RTX
Tongsheng Geng, Marcos Amaris, Stephane Zuckerman, Alfredo Goldman, Guang R. Gao, Jean-Luc Gaudiot
Tags: Computer science, CUDA, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GTX Titan, Performance, Sparse matrix, Task scheduling, Tesla K20, Tesla K40