hgpu.org » Tesla M40
Jeremy Appleyard, Tomas Kocisky, Phil Blunsom
Tags: Computer science, CUDA, Deep learning, Lua, Neural networks, nVidia, Package, RNN, Tesla M40
April 9, 2016 by hgpu




