hgpu.org » OpenMP
Amir Gholami, Judith Hill, Dhairya Malhotra, George Biros
Tags: Algorithms, Computer science, CUDA, FFT, MPI, nVidia, OpenMP, Package, Tesla K20
June 30, 2015 by hgpu