Views of posts on hgpu.org
Understanding the SIMD Efficiency of Graph Traversal on GPU 2,436 views
Performance Evaluation of Intel Xeon Phi Coprocessor using XKaapi 2,436 views
Probabilistic View-based 3D Curve Skeleton Computation on the GPU 2,435 views
Pyramid Methods in GPU-Based Image Processing 2,434 views
A Parallel GPU Version of the Traveling Salesman Problem 2,434 views
2-D Impulse Noise Suppression by Recursive Gaussian Maximum Likelihood Estimation 2,434 views
Large Scale Monte Carlo Tree Search on GPU 2,434 views
Implementation of medical image segmentation in CUDA 2,433 views
Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU 2,433 views
Speculative Segmented Sum for Sparse Matrix-Vector Multiplication on Heterogeneous Processors 2,432 views
Large Graphs on multi-GPUs 2,432 views
Towards Lattice Quantum Chromodynamics on FPGA devices 2,432 views
Using CUDA for Exhaustive Password Recovery 2,431 views
On modelling of anisotropic viscoelasticity for soft tissue simulation: numerical solution and GPU execution 2,431 views
Reflective Shadow Map Clustering for Real-Time Global Illumination 2,431 views
Accelerate micromagnetic simulations with GPU programming in MATLAB 2,429 views
3D nonrigid registration via optimal mass transport on the GPU 2,429 views
Trainable Nonlinear Reaction Diffusion: A Flexible Framework for Fast and Effective Image Restoration 2,428 views
OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems 2,428 views
Surface Reconstruction from Scattered Point via RBF Interpolation on GPU 2,428 views
Algorithmic GPGPU Memory Optimization 2,427 views
Parallelization of the x264 encoder using OpenCL 2,427 views
CT to Cone-beam CT Deformable Registration With Simultaneous Intensity Correction 2,427 views
Acceleration of k-Nearest Neighbor and SRAD Algorithms Using Intel FPGA SDK for OpenCL 2,426 views
Simulating and Visualizing Real-Time Crowds on GPU Clusters 2,426 views
Autotuning OpenCL Workgroup Size for Stencil Patterns 2,426 views
All-pairs shortest-paths for large graphs on the GPU 2,426 views
CUDA-C implementation of the ADER-DG method for linear hyperbolic PDEs 2,426 views
GPU Accelerated Graph SLAM and Occupancy Voxel Based ICP For Encoder-Free Mobile Robots 2,425 views
On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures 2,424 views
Computer Simulation of Saturn’s Ring Structure 2,424 views
Distributed GPU Password Cracking Research Project 2,423 views
Jump flooding in GPU with applications to Voronoi diagram and distance transform 2,423 views
Compute Pairwise Manhattan Distance and Pearson Correlation Coefficient of Data Points with GPU 2,423 views
Improving GPU Performance via Large Warps and Two-Level Warp Scheduling 2,423 views
Evaluation of Multi-Threading in Vulkan 2,422 views
Swan: A tool for porting CUDA programs to OpenCL 2,422 views
Tensor Contractions with Extended BLAS Kernels on CPU and GPU 2,421 views
CUDA Accelerated Robot Localization and Mapping 2,421 views
Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms 2,420 views
Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition 2,420 views
Real-time physically cloth simulation with CUDA 2,419 views
Fast k Nearest Neighbor Search using GPU 2,419 views
GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics 2,419 views
The gputools package enables GPU computing in R 2,419 views
ThunderSVM: A Fast SVM Library on GPUs and CPUs 2,418 views
DeepDSL: A Compilation-based Domain-Specific Language for Deep Learning 2,416 views
Data Structures and Algorithms for Counting Problems on Graphs using GPU 2,416 views
Simulating spiking neural networks on GPU 2,416 views
Effect of GPU Communication-Hiding for SpMV Using OpenACC 2,415 views
Signal Processing and General Purpose Computing on GPU 2,415 views
The GPU-based High-performance Pattern-matching Algorithm for Intrusion Detection 2,414 views
Finite-size scaling method for the Berezinskii-Kosterlitz-Thouless transition 2,414 views
2PARMA: Parallel Paradigms and Run-time Management Techniques for Many-Core Architectures 2,413 views
High-precision molecular dynamics simulation of UO2-PuO2: Anion self-diffusion in UO2 2,413 views
3D data denoising via Non-Local means filter by using parallel GPU strategies 2,413 views
Collaborative Diffusion on the GPU for Path-Finding in Games 2,413 views
iGPU: Exception Support and Speculative Execution on GPUs 2,412 views
CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication 2,412 views
Input Space Splitting for OpenCL 2,412 views
Hardware accelerators for biocomputing: A survey 2,411 views
SPRAT: Runtime processor selection for energy-aware computing 2,411 views
Clustering on GPU – A Brief Survey 2,410 views
Parakeet: A Just-In-Time Parallel Accelerator for Python 2,410 views
Comparison of OpenMP & OpenCL Parallel Processing Technologies 2,410 views
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators 2,409 views
Volume Raycasting Performance Using DirectCompute 2,408 views
Unlocking Bandwidth for GPUs in CC-NUMA Systems 2,406 views
Integrating GPGPU computations with CPU coroutines in C++ 2,405 views
Spark-GPU: An Accelerated In-Memory Data Processing Engine on Clusters 2,405 views
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images 2,404 views
Accelerating H.264 Advanced Video Coding with GPU/CUDA Technology 2,404 views
The implementation of Multi-Scale Retinex image enhancement algorithm based on GPU via CUDA 2,403 views
A Parallel Intermediate Representation for Embedded Languages 2,403 views
Toward optimised skeletons for heterogeneous parallel architecture with performance cost model 2,402 views
A 57mW embedded mixed-mode neuro-fuzzy accelerator for intelligent multi-core processor 2,402 views
Parallelization of SAT Algorithms on GPUs 2,402 views
Fast inference of deep neural networks in FPGAs for particle physics 2,400 views
Massively parallelizable list-mode reconstruction using a Monte Carlo-based elliptical Gaussian model 2,400 views
Parallel and Scalable Sparse Basic Linear Algebra Subprograms 2,400 views
“Local Rank Differences” Image Feature Implemented on GPU 2,399 views
3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs 2,399 views
GPU Cluster for High Performance Computing 2,398 views
A Comparative Study of OpenACC Implementations 2,398 views
Test-driving Intel Xeon Phi 2,397 views
A Survey on Compiler Autotuning using Machine Learning 2,396 views
Efficient deconvolution methods for astronomical imaging: algorithms and IDL-GPU codes 2,396 views
ThunderGBM: Fast GBDTs and Random Forests on GPUs 2,395 views
MPI Parallelization of GPU-based Lattice Boltzmann Simulations 2,395 views
Deep, Big, Simple Neural Nets for Handwritten Digit Recognition 2,395 views
Compute Distance Matrices with GPU 2,394 views
Deploying Graph Algorithms on GPUs: an Adaptive Solution 2,394 views
A scalable, numerically stable, high-performance tridiagonal solver using GPUs 2,394 views
Titles: 100
Total views: 241,538