Tags: Algorithms, ATI, ATI Radeon HD 6550, Computer science, Heterogeneous systems, List ranking, OpenCL
Tags: Algorithms, Cell processor, Computer science, CUDA, Distributed computing, List ranking, nVidia, nVidia GeForce GTX 280, nVidia GeForce GTX 580, OpenCL, Sorting, Thesis
Tags: Algorithms, Computer science, CUDA, Heterogeneous systems, Information Retrieval, List ranking, nVidia, nVidia GeForce GTX 480, Tesla C1060, Tesla C2050, Thesis
Tags: Algorithms, Computer science, CUDA, List ranking, Monte Carlo simulation, nVidia, Pseudo-random number generators, Tesla C1060
Tags: Algorithms, Computer science, CUBLAS, CUDA, List ranking, nVidia, Sparse matrix, Tesla T20
Tags: Algorithms, Computer science, CUDA, List ranking, nVidia, Sorting, Tesla C1060
Tags: ATI, ATI CAL, ATI IL, ATI Radeon HD 5870, ATI Stream, Computer science, List ranking, OpenCL, Sparse matrix
Tags: Computer science, CUDA, List ranking, nVidia, Tesla C1060
Recent source codes
Most viewed papers (last 30 days)
- Data-efficient LLM Fine-tuning for Code Generation
- LithOS: An Operating System for Efficient Machine Learning on GPUs
- Large Language Model Powered C-to-CUDA Code Translation: A Novel Auto-Parallelization Framework
- MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications
- GigaAPI for GPU Parallelization
- Scalability Evaluation of HPC Multi-GPU Training for ECG-based LLMs
- A Power-Efficient Scheduling Approach in a Cpu-Gpu Computing System by Thread-Based Parallel Programming
- DeepCompile: A Compiler-Driven Approach to Optimizing Distributed Deep Learning Training
- InteropUnityCUDA: A Tool for Interoperability Between Unity and CUDA
- GPU-centric Communication Schemes for HPC and ML Applications