Tags: Code generation, Computer science, Embedded high-performance computing, nVidia, nVidia Jetson AGX Xavier, nVidia Jetson Nano, nVidia Jetson TX2, OpenCL, Tesla T4, Tesla V100, Thesis
Tags: Android, Computer science, Computer vision, Embedded high-performance computing, nVidia, nVidia GeForce GTX 660, OpenCL, Package, Thesis
Tags: Embedded high-performance computing, Energy-efficient computing, FPGA, GPU, Power-efficient computing
Tags: Computer science, CUDA, Embedded high-performance computing, GPGPU-sim, Memory, nVidia, Performance
Tags: Algorithms, ARM, Computer science, Embedded high-performance computing, OpenCL, Pattern Search
Tags: Algorithms, Computer science, CUDA, Embedded high-performance computing, nVidia, nVidia GeForce 8800 GTX, OpenMP, Performance, Ultrasound
Most viewed papers (last 30 days)
- Omniwise: Predicting GPU Kernels Performance with LLMs
- P4OMP: Retrieval-Augmented Prompting for OpenMP Parallelism in Serial Code
- Engineering Supercomputing Platforms for Biomolecular Applications
- CUDA-LLM: LLMs Can Write Efficient CUDA Kernels
- GCStack+GCScaler: Fast and Accurate GPU Performance Analyses Using Fine-Grained Stall Cycle Accounting and Interval Analysis
- A First Look at Bugs in LLM Inference Engines
- ParEval-Repo: A Benchmark Suite for Evaluating LLMs with Repository-level HPC Translation Tasks
- Efficient GPU Implementation of Multi-Precision Integer Division
- Accelerated discovery and design of Fe-Co-Zr magnets with tunable magnetic anisotropy through machine learning and parallel computing
- chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations