hgpu.org » nVidia Quadro P 620
Lazaros Papadopoulos, Dimitris John Soudris, Christoph Kessler, August Ernstsson, Johan Ahlqvist, Nikos Vasilas, Athanasios Papadopoulos, Panos Seferlis, Charles Prouveur, Matthieu Haefele, Samuel Paul Thibault, Athanasios Salamanis, Theodoros Ioakimidis, Dionisis D. Kehagias
Tags: Computer science, CUDA, FPGA, Heterogeneous systems, MPI, nVidia, nVidia Quadro P 620, OpenCL, OpenMP, Tesla P100, Tesla V100
August 22, 2021 by hgpu



