hgpu.org » Tesla K40
Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda
Tags: Benchmarking, Caffe, Computer science, CUBLAS, CUDA, Deep learning, Intel Xeon Phi, Machine learning, nVidia, Tesla K40, Tesla K80, Tesla P100
December 24, 2017 by hgpu