Somashekaracharya G. Bhaskaracharya, Julien Demouth, Vinod Grover
Tags: Compilers, Computer science, CUBLAS, CUDA, Deep learning, Matrix multiplication, nVidia, nVidia Quadro GV100, Performance, Programming Languages, PTX
Gangwon Jo, Heehoon Kim, Jeesoo Lee, Jaejin Lee
Weicheng Xue, Christopher J. Roy
Tom Deakin, Simon McIntosh-Smith
Sohan Lal, Aksel Alpay, Philip Salzmann, Biagio Cosenza, Nicolai Stawinoga, Peter Thoman, Thomas Fahringer, Vincent Heuveline
Tongsheng Geng, Marcos Amaris, Stephane Zuckerman, Alfredo Goldman, Guang R. Gao, Jean-Luc Gaudiot
Tags: Computer science, CUDA, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GTX Titan, Performance, Sparse matrix, Task scheduling, Tesla K20, Tesla K40
Kaijie Fan, Biagio Cosenza, Ben Juurlink
Michael Knobloch, Bernd Mohr
Arya Mazaheri, Tim Beringer, Matthew Moskewicz, Felix Wolf, Ali Jannesari
Tags: AMD Radeon RX 580, ATI, Code generation, Computer science, CUDA, Deep learning, Neural networks, nVidia, nVidia GeForce GTX 1080 Ti, OpenCL, Performance, performance portability, Symbolic Computation, Vulkan