Views of posts on hgpu.org
An MPI-Based Python Framework for Distributed Training with Keras 2,250 views
ART vs. NDK vs. GPU acceleration: A study of performance of image processing algorithms on Android 2,249 views
Subdivision Surface Evaluation as Sparse Matrix-Vector Multiplication 2,249 views
A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels 2,249 views
Energy Efficiency Analysis of GPUs 2,248 views
Mars: a MapReduce framework on graphics processors 2,248 views
Graph Coarsening and Clustering on the GPU 2,247 views
MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC) 2,247 views
Parallel Genetic Algorithms on a GPU to Solve the Travelling Salesman Problem 2,247 views
Reducing branch divergence in GPU programs 2,247 views
GPU Accelerated Automated Feature Extraction from Satellite Images 2,247 views
The Dynamical Kernel Scheduler – Part 1 2,247 views
Fast Hamiltonian Monte Carlo Using GPU Computing 2,247 views
gpuSPHASE – A shared memory caching implementation for 2D SPH using CUDA 2,246 views
ClearPath: highly parallel collision avoidance for multi-agent simulation 2,246 views
PFAC Library: GPU-based string matching algorithm 2,246 views
Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks 2,246 views
ASAMgpu V1.0 – a moist fully compressible atmospheric model using graphics processing units (GPUs) 2,246 views
Real-Time Plane-Sweeping Stereo with Multiple Sweeping Directions 2,246 views
GPU Acceleration of Runge-Kutta Integrators 2,246 views
Performance and Power Comparisons Between Nvidia and ATI GPUs 2,246 views
Accelerating MATLAB Image Processing Toolbox functions on GPUs 2,246 views
Accelerating Habanero-Java Programs with OpenCL Generation 2,246 views
A GPU-based architecture for real-time data assessment at synchrotron experiments 2,246 views
A Code Transformation Framework for Scientific Applications on Structured Grids 2,245 views
Face Detection on CUDA 2,245 views
An ultrasonic imaging system based on a new SAFT approach and a GPU beamformer 2,245 views
Device specialization in heterogeneous multi-GPU environments 2,244 views
Research on CUDA-based Kriging Interpolation Algorithm 2,244 views
Implementation of Smith-Waterman Algorithm in OpenCL for GPUs 2,244 views
The AES Implantation Based on OpenCL for Multi/many Core Architecture 2,244 views
Fast Predictive Image Registration 2,243 views
A characterization and analysis of PTX kernels 2,243 views
Fast TV-L1 Optical Flow for Interactivity 2,243 views
Lattice Boltzmann Simulations on a GPU: An optimization approach using C++ AMP 2,242 views
Image Encryption Using Parallel RSA Algorithm on CUDA 2,242 views
FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10 2,242 views
Design and Optimization of Image Processing Algorithms on Mobile GPU 2,241 views
B-CALM: An open-source GPU-based 3D-FDTD with multi-pole dispersion for plasmonics 2,241 views
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model 2,241 views
Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments 2,240 views
Fast GPU-based calculations in few-body quantum scattering 2,240 views
The battle of the giants: a case study of GPU vs FPGA optimisation for real-time image processing 2,240 views
On the Way to Future’s High Energy Particle Physics Transport Code 2,240 views
GPU-Based Computation of Discrete Periodic Centroidal Voronoi Tessellation in Hyperbolic Space 2,239 views
HyPHI – task based hybrid execution C++ library for the Intel Xeon Phi coprocessor 2,239 views
A GPU-based hyperbolic SVD algorithm 2,239 views
StePS: A Multi-GPU Cosmological N-body Code for Compactified Simulations 2,238 views
On Binaural Spatialization and the Use of GPGPU for Audio Processing 2,238 views
A Comparison of the performance of HPC Accelerators 2,238 views
Geometric Algebra Enhanced Precompiler for C++, OpenCL and Mathematica’s OpenCLLink 2,238 views
GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads 2,237 views
Speeding up Large-Scale Point-in-Polygon Test Based Spatial Join on GPUs 2,237 views
Microbenchmarks for GPU characteristics: the occupancy roofline and the pipeline model 2,237 views
Intel FPGA SDK for OpenCL 2,237 views
Ray Reordering Techniques for GPU Ray-Cast Ambient Occlusion 2,237 views
GPU Based Generation and Real-Time Rendering of Semi-Procedural Terrain Using Features 2,237 views
On Reinforcement Learning for Full-length Game of StarCraft 2,237 views
An architecture for real time fluid simulation using multiple GPUs 2,236 views
Parallelization and Performance of the NIM Weather Model for CPU, GPU and MIC Processors 2,236 views
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting 2,236 views
GPU Implementation of the DP code 2,235 views
Accelerating SQL Database Operations on a GPU with CUDA 2,235 views
GPU implemention of fast Gabor filters 2,235 views
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS 2,235 views
Sparse Matrix-Vector Multiplication on GPGPUs 2,234 views
Locality optimization on a NUMA architecture for hybrid LU factorization 2,233 views
GPU Programming for Physics Applications 2,233 views
Model-driven autotuning of sparse matrix-vector multiply on GPUs 2,233 views
Shared Sampling for Real-Time Alpha Matting 2,232 views
GPU concurrency: Weak behaviours and programming assumptions 2,232 views
GGNN: Graph-based GPU Nearest Neighbor Search 2,232 views
Fast seismic modeling and Reverse Time Migration on a GPU cluster 2,231 views
Non-local means denoising algorithm accelerated by GPU 2,231 views
Optimization and Parallelization Methods for the Design of Next-Generation Radio Networks 2,231 views
How to scale distributed deep learning? 2,230 views
Parallel Chen-Han (PCH) Algorithm for Discrete Geodesics 2,230 views
Solving Linear Recurrences on Hybrid GPU Accelerated Manycore Systems 2,229 views
HSApriori: High Speed Association Rule Mining using Apriori Based Algorithm for GPU 2,229 views
A Quantitative Study of Irregular Programs on GPUs 2,229 views
The Performance Analysis Based on Heterogeneous Parallel Processors for Anisotropic Diffusion Filters 2,229 views
An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures 2,229 views
XSD: Accelerating MapReduce by Harnessing the GPU inside an SSD 2,228 views
Implementing Ultrasound Beamforming on the GPU using CUDA 2,228 views
Mersenne Twister Random Number Generation on FPGA, CPU and GPU 2,228 views
Designing a Unified Programming Model for Heterogeneous Machines 2,228 views
A Quantitative Performance Analysis Model for GPU Architectures 2,228 views
Large-Scale Compute-Intensive Analysis via a Combined In-Situ and Co-Scheduling Workflow Approach 2,228 views
Digital beamforming using a GPU 2,227 views
Interactive volumetric lighting simulating scattering and shadowing 2,227 views
Persistent RNNs: Stashing Recurrent Weights On-Chip 2,227 views
Computation of gray-level co-occurrence matrix based on CUDA and its optimization 2,227 views
Computing the distance between two finite element solutions defined on different 3D meshes on a GPU 2,227 views
Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures 2,226 views
Titles: 100
Total views: 223,854
- Programming - 186,131 views
- Login - 164,415 views
- User dashboard - 90,773 views
- Paper titles list - 70,172 views
- Add new event - 64,601 views
- Add new post - 59,382 views
- Register - 49,237 views
- Statistics - 36,643 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,167 views
- Books on OpenCL and CUDA - 28,827 views