hgpu.org » Computer science
Kai Zhu, Wenyi Zhao, Zhen Zheng, Tianyou Guo, Pengzhan Zhao, Junjie Bai, Jun Yang, Xiaoyong Liu, Lansong Diao, Wei Lin
Tags: Compilers, Computer science, CUDA, Machine learning, nVidia, Tesla T4
March 14, 2021 by hgpu
Tiago Augusto Engel, Andrea Schwertner Charao, Manuele Kirsch-Pinheiro, Luiz-Angelo Steffenel
Tags: Computer science, CUDA, Data mining, Java, Matrix multiplication, nVidia, nVidia Quadro K 2000, Package, Tesla K20, Tesla M2050
June 13, 2014 by hgpu
* * *
Most viewed papers (last 30 days)
- The Anatomy of a Triton Attention Kernel
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
- Scalable GPU-Based Integrity Verification for Large Machine Learning Models
- INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
- An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
- Enhancing Transformer Performance and Portability through Auto-tuning Frameworks
- KernelBand: Boosting LLM-based Kernel Optimization with a Hierarchical and Hardware-aware Multi-armed Bandit
- RDMA Point-to-Point Communication for LLM Systems
- A Study of Floating-Point Precision Tuning in Deep Learning Operators Implementations
- ProofWright: Towards Agentic Formal Verification of CUDA
* * *




