hgpu.org » Dense linear algebra
Chetan Jhurani, Paul Mullowney
Tags: BLAS, CUBLAS, CUDA, Dense linear algebra, GEMM, Linear Algebra, nVidia, Parallel programming, Tesla K20
April 9, 2013 by chetan.jhurani