hgpu.org » nVidia Quadro K420
Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter
Tags: Benchmarking, Computer science, CUDA, Deep learning, FFT, Heterogeneous systems, HPC, Machine learning, nVidia, nVidia Quadro K420, OpenMPI, Package, Performance, Python, TensorFlow, Tesla K80, Tesla V100
March 17, 2019 by hgpu
* * *
Most viewed papers (last 30 days)
- The Anatomy of a Triton Attention Kernel
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
- Scalable GPU-Based Integrity Verification for Large Machine Learning Models
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
- An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
- Enhancing Transformer Performance and Portability through Auto-tuning Frameworks
- A Study of Floating-Point Precision Tuning in Deep Learning Operators Implementations
- RDMA Point-to-Point Communication for LLM Systems
- ProofWright: Towards Agentic Formal Verification of CUDA
- Inside VOLT: Designing an Open-Source GPU Compiler