hgpu.org » Dense linear algebra
Chetan Jhurani, Paul Mullowney
Tags: BLAS, CUBLAS, CUDA, Dense linear algebra, GEMM, Linear Algebra, nVidia, Parallel programming, Tesla K20
April 9, 2013 by chetan.jhurani
* * *