hgpu.org » nVidia Quadro P 620
Lazaros Papadopoulos, Dimitris John Soudris, Christoph Kessler, August Ernstsson, Johan Ahlqvist, Nikos Vasilas, Athanasios Papadopoulos, Panos Seferlis, Charles Prouveur, Matthieu Haefele, Samuel Paul Thibault, Athanasios Salamanis, Theodoros Ioakimidis, Dionisis D. Kehagias
Tags: Computer science, CUDA, FPGA, Heterogeneous systems, MPI, nVidia, nVidia Quadro P 620, OpenCL, OpenMP, Tesla P100, Tesla V100
August 22, 2021 by hgpu



