hgpu.org » nVidia Quadro P 620
Lazaros Papadopoulos, Dimitris John Soudris, Christoph Kessler, August Ernstsson, Johan Ahlqvist, Nikos Vasilas, Athanasios Papadopoulos, Panos Seferlis, Charles Prouveur, Matthieu Haefele, Samuel Paul Thibault, Athanasios Salamanis, Theodoros Ioakimidis, Dionisis D. Kehagias
Tags: Computer science, CUDA, FPGA, Heterogeneous systems, MPI, nVidia, nVidia Quadro P 620, OpenCL, OpenMP, Tesla P100, Tesla V100
August 22, 2021 by hgpu