hgpu.org » pyCUDA
Richard Schoonhoven, Ben van Werkhoven, Kees Joost Batenburg
Tags: AMD Radeon Instinct Mi50, ATI, Auto-Tuning, Benchmarking, Computer science, CUDA, nVidia, nVidia A100, nVidia GeForce GTX 1080 Ti, nVidia GeForce GTX Titan X, nVidia Titan RTX, OpenCL, Performance, pyCUDA, PyOpenCL, Tesla K20, Tesla P100, Tesla V100
October 9, 2022 by hgpu
Florencio Balboa Usabiaga, Blaise Delmotte, Aleksandar Donev
Tags: Condensed matter, CUDA, nVidia, Package, Physics, pyCUDA, Soft Condensed Matter
December 6, 2016 by hgpu




