hgpu.org » Dense linear algebra
Chetan Jhurani, Paul Mullowney
Tags: BLAS, CUBLAS, CUDA, Dense linear algebra, GEMM, Linear Algebra, nVidia, Parallel programming, Tesla K20
April 9, 2013 by chetan.jhurani