hgpu.org » nVidia A40
Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
Tags: Computer science, CUDA, Deep learning, Machine learning, Matrix multiplication, nVidia, nVidia A40, Package, PyTorch
August 21, 2022 by hgpu
* * *