Tatsumi Aoyama, Ken-Ichi Ishikawa, Yasuyuki Kimura, Hideo Matsufuru, Atsushi Sato, Tomohiro Suzuki, Sunao Torii

Steven Eliuk, Cameron Upright, Anthony Skjellum

Tags: Computer science, CUDA, Deep learning, Heterogeneous systems, Linear Algebra, Matrix multiplication, Neural and Evolutionary Computing, Neural networks, nVidia, OpenMPI, Tesla K80

Marat Dukhan, Richard Vuduc, Jason Riedy

Jorge F. Fabeiro, Diego Andrade, Basilio B. Fraguela

Tags: AMD FirePro S9150, ATI, Code generation, Computer science, Heterogeneous systems, Intel Xeon Phi, Matrix multiplication, nVidia, OpenCL, Package, Performance, performance portability, Tesla K20

February 11, 2016 by hgpu

Toomas Remmelg, Thibaut Lutz, Michel Steuwer, Christophe Dubach

February 10, 2016 by hgpu

A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, Tz. Kolev, I. Masliah, S. Tomov

Tags: Algorithms, Code generation, Computer science, CUDA, FEM, Finite element method, Linear Algebra, Matrix multiplication, nVidia, OpenMP, Tesla K40

Linnan Wang, Wei Wu, Jianxiong Xiao, Yang Yi

Tags: Algorithms, Computer science, CUDA, Deep learning, Machine learning, Matrix multiplication, Neural and Evolutionary Computing, Neural networks, nVidia, nVidia GeForce GTX Titan X, Tesla K40

November 20, 2015 by hgpu