hgpu.org » Tesla K40
Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda
Tags: Benchmarking, Caffe, Computer science, CUBLAS, CUDA, Deep learning, Intel Xeon Phi, Machine learning, nVidia, Tesla K40, Tesla K80, Tesla P100
December 24, 2017 by hgpu
* * *