Tags: Algorithms, ATI, ATI Radeon HD 6550, Computer science, Heterogeneous systems, List ranking, OpenCL
Tags: Algorithms, Cell processor, Computer science, CUDA, Distributed computing, List ranking, nVidia, nVidia GeForce GTX 280, nVidia GeForce GTX 580, OpenCL, Sorting, Thesis
Tags: Algorithms, Computer science, CUDA, Heterogeneous systems, Information Retrieval, List ranking, nVidia, nVidia GeForce GTX 480, Tesla C1060, Tesla C2050, Thesis
Tags: Algorithms, Computer science, CUDA, List ranking, Monte Carlo simulation, nVidia, Pseudo-random number generators, Tesla C1060
Tags: Algorithms, Computer science, CUBLAS, CUDA, List ranking, nVidia, Sparse matrix, Tesla T20
Tags: Algorithms, Computer science, CUDA, List ranking, nVidia, Sorting, Tesla C1060
Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, List ranking, OpenCL, Sparse matrix
Tags: Computer science, CUDA, List ranking, nVidia, Tesla C1060



