Tags: Code generation, Computer science, Embedded high-performance computing, nVidia, nVidia Jetson AGX Xavier, nVidia Jetson Nano, nVidia Jetson TX2, OpenCL, Tesla T4, Tesla V100, Thesis
Tags: Android, Computer science, Computer vision, Embedded high-performance computing, nVidia, nVidia GeForce GTX 660, OpenCL, Package, Thesis
Tags: Embedded high-performance computing, Energy-efficient computing, FPGA, GPU, Power-efficient computing
Tags: Computer science, CUDA, Embedded high-performance computing, GPGPU-sim, Memory, nVidia, Performance
Tags: Algorithms, ARM, Computer science, Embedded high-performance computing, OpenCL, Pattern Search
Tags: Algorithms, Computer science, CUDA, Embedded high-performance computing, nVidia, nVidia GeForce 8800 GTX, OpenMP, Performance, Ultrasound