Views of posts on hgpu.org
Parallel Verlet neighbor list algorithm for GPU-optimized MD simulations 2,278 views
CLTune: A Generic Auto-Tuner for OpenCL Kernels 2,277 views
Implementation of 2-D Discrete Cosine Transform Algorithm on GPU 2,276 views
GPflow: A Gaussian process library using TensorFlow 2,276 views
GPU System Call 2,275 views
A block-asynchronous relaxation method for graphics processing units 2,275 views
Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level 2,275 views
Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs 2,274 views
Multiresolution Flow Simulations on Multi/many-core Architectures 2,274 views
YaDiV - an open platform for 3D visualization and 3D segmentation of medical data 2,274 views
Fast Hough transform on GPUs: exploration of algorithm trade-offs 2,273 views
FluidFFT: common API (C++ and Python) for Fast Fourier Transform HPC libraries 2,272 views
3D Modeling, Distance and Gradient Computation for Motion Planning: A Direct GPGPU Approach 2,271 views
GPUVerify: A Verifier for GPU Kernels 2,271 views
Parallel Batch Training of the Self-Organizing Map Using OpenCL 2,271 views
Cache Miss Analysis for GPU Programs Based on Stack Distance Profile 2,271 views
The Heisenberg spin glass model on GPU: myths and actual facts 2,271 views
Use NVIDIA CUDA technology to create genetic algorithms with extensive population 2,270 views
Accelerating Ant Colony Optimization-based Edge Detection on the GPU using CUDA 2,270 views
A Monte Carlo Neutron Transport Code for Eigenvalue Calculations on a Dual-GPU System and CUDA Environment 2,270 views
Single Server Multi-GPU Training of ConvNets 2,270 views
Advanced Trends of Heterogeneous Computing with CPU-GPU Integration: Comparative Study 2,270 views
Sapporo2: A versatile direct N-body library 2,269 views
Parallel Matching and Clustering Algorithms on GPUs 2,269 views
Improving GPU Performance Prediction with Data Transfer Modeling 2,268 views
Grid-based SAH BVH construction on a GPU 2,268 views
All-Pairs Shortest Path Algorithms Using CUDA 2,267 views
Fast Implementation of DGEMM on Fermi GPU 2,267 views
GPUvm: Why Not Virtualizing GPUs at the Hypervisor? 2,267 views
A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU + Fast Matrix Multiply 2,266 views
A Survey Paper on Solving TSP using Ant Colony Optimization on GPU 2,266 views
Multi-core CPU or GPU-accelerated Multiscale Modeling for Biomolecular Complexes 2,266 views
Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques 2,266 views
CUDAICA: GPU optimization of Infomax-ICA EEG analysis 2,266 views
GPU accelerated feature algorithms for mobile devices 2,266 views
Graph Processing on GPUs: A Survey 2,265 views
Machine Learning in Compilers: Past, Present and Future 2,265 views
GPU’s for event reconstruction in the FairRoot framework 2,265 views
A Comparative Measurement Study of Deep Learning as a Service Framework 2,264 views
Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices 2,264 views
NOVA: A Functional Language for Data Parallelism 2,264 views
Mars: Accelerating MapReduce with Graphics Processors 2,263 views
Implementation of a High Throughput 3GPP Turbo Decoder on GPU 2,263 views
A Cross-platform Evaluation of Graphics Shader Compiler Optimization 2,262 views
Barra, a Modular Functional GPU Simulator for GPGPU 2,262 views
wav2letter++: The Fastest Open-source Speech Recognition System 2,262 views
Parallel Statistical Multi-resolution Estimation 2,261 views
Sylkan: Towards a Vulkan Compute Target Platform for SYCL 2,261 views
Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines 2,261 views
Learning Structured Sparsity in Deep Neural Networks 2,261 views
A closer look at GPUs 2,261 views
VoxelPipe: a programmable pipeline for 3D voxelization 2,260 views
Behavioral Non-portability in Scientific Numeric Computing 2,260 views
Image Processing with CUDA 2,260 views
Efficient Execution of AMR Computations on GPU Systems 2,260 views
Speculative Execution on Multi-GPU Systems 2,259 views
DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications 2,259 views
Scalability of Self-organizing Maps on a GPU cluster using OpenCL and CUDA 2,259 views
Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing 2,259 views
GPU-Accelerated Numerical Simulations of the Knudsen Gas on Time-Dependent Domains 2,258 views
GPGPU-Sim 2,258 views
A Generic Inverted Index Framework for Similarity Search on the GPU 2,258 views
The Ocean Tensor Package 2,258 views
Modeling Deep Learning Accelerator Enabled GPUs 2,258 views
GPU Ray Marching for Real-Time Rendering of Participating Media 2,257 views
Effective GPU Strategies for LU Decomposition 2,257 views
GPU-Accelerated BWT Construction for Large Collection of Short Reads 2,256 views
Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA 2,256 views
GPU Programming in a High Level Language: Compiling X10 to CUDA 2,256 views
Nodal Discontinuous Galerkin Methods on Graphics Processors 2,256 views
GPU Accelerated Particle Visualization with Splotch 2,255 views
pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor 2,255 views
Comparing Parallel Hardware Architectures for Visually Guided Robot Navigation 2,255 views
Deep Learning for Computational Chemistry 2,255 views
Performance of Kepler GTX Titan GPUs and Xeon Phi System 2,255 views
GPU-accelerated Gibbs Sampling 2,254 views
GPU accelerated toolbox for real-time beam-shaping in multimode fibres 2,254 views
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions 2,254 views
Chrono: a parallel multi-physics library for rigid-body, flexible-body, and fluid dynamics 2,254 views
GPU packet classification using OpenCL: a consideration of viable classification methods 2,253 views
Implementation of FDTD-Compatible Green’s Function on Heterogeneous CPU-GPU Parallel Processing System 2,253 views
Parallel LZ77 Decoding using a GPU 2,253 views
Adaboost GPU-based Classifier for Direct Volume Rendering 2,252 views
A High Performance Random Number Generator Using Heterogeneous Computing Platform 2,252 views
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin 2,252 views
Simulating the universe with GPU-accelerated supercomputers: n-body methods, tests, and examples 2,252 views
AMGCL: an Efficient, Flexible, and Extensible Algebraic Multigrid Implementation 2,252 views
A Compiler and Runtime for Heterogeneous Computing 2,252 views
Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis 2,251 views
Fast parallel Particle-To-Grid interpolation for plasma PIC simulations on the GPU 2,251 views
An MPI-Based Python Framework for Distributed Training with Keras 2,250 views
Titles: 100
Total views: 226,289
- Programming - 186,131 views
- Login - 164,415 views
- User dashboard - 90,774 views
- Paper titles list - 70,172 views
- Add new event - 64,602 views
- Add new post - 59,383 views
- Register - 49,237 views
- Statistics - 36,644 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,167 views
- Books on OpenCL and CUDA - 28,827 views