hgpu.org » Embedded high-performance computing
Kulin V. Seth
Tags: Benchmarking, Computer science, DSP, Embedded high-performance computing, Heterogeneous systems, OpenCL, Optimization, Thesis
September 23, 2011 by hgpu
Jason Loew, Jesse Elwell, Dmitry Ponomarev, Patrick H. Madden
September 23, 2011 by hgpu
Shuai Mu, Chenxi Wang, Ming Liu, Dongdong Li, Maohua Zhu, Xiaoliang Chen, Xiang Xie, Yangdong Deng
May 30, 2011 by hgpu
Muhsen Owaida, Nikolaos Bellas, Konstantis Daloukas, Christos D. Antonopoulos
Tags: Code generation, Compilers, Computer science, Electronic design automation, Embedded high-performance computing, FPGA, Heterogeneous systems, OpenCL
May 21, 2011 by hgpu
T. Scogland, H. Lin, W. Feng
Tags: Computer science, Embedded high-performance computing, Energy-efficient computing, Green, Performance
November 2, 2010 by hgpu