hgpu.org » nVidia Quadro K420
Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter
Tags: Benchmarking, Computer science, CUDA, Deep learning, FFT, Heterogeneous systems, HPC, Machine learning, nVidia, nVidia Quadro K420, OpenMPI, Package, Performance, Python, TensorFlow, Tesla K80, Tesla V100
March 17, 2019 by hgpu