Views of posts on hgpu.org
GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training 2,693 views
A GPU-Based Wide-Band Radio Spectrometer 2,692 views
Unsupervised Asset Cluster Analysis Implemented with Parallel Genetic Algorithms on the NVIDIA CUDA Platform 2,689 views
GPU Accelerated Greedy Algorithms for Compressed Sensing 2,688 views
Multi-GPU Acceleration of Black-Scholes Equation based Option Pricing 2,688 views
GPU Implementation of the Particle Filter 2,685 views
Finite Pointset Method for 2D Dam-Break Problem with GPU-Acceleration 2,685 views
G-SNPM – A GPU-based SNP mapping tool 2,684 views
A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory 2,683 views
3DES ECB Optimized for Massively Parallel CUDA GPU Architecture 2,682 views
High Throughput Low Latency LDPC Decoding on GPU for SDR Systems 2,681 views
The Accelerator Wall: Limits of Chip Specialization 2,680 views
Evolution of thread-level parallelism in desktop applications 2,680 views
FPGA and GPU implementation of large scale SpMV 2,679 views
Precomputed Atmospheric Scattering 2,678 views
Image Denoising Using Wavelet Transform and CUDA 2,678 views
Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths 2,674 views
Hybrid CPU-GPU Implementation of Tracking-Learning-Detection Algorithm 2,673 views
cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU 2,672 views
Efficient softmax approximation for GPUs 2,671 views
Implementing density functional theory (DFT) methods on many-core GPGPU accelerators 2,670 views
Ray Tracing on GPUs 2,670 views
Local Laplacian Filters: Edge-aware Image Processing with a Laplacian Pyramid 2,670 views
3D Recursive Gaussian IIR on GPU and FPGAs: A Case Study for Accelerating Bandwidth-Bounded Applications 2,669 views
Multi-Scale, Multi-Level, Heterogeneous Features Extraction and Classification of Volumetric Medical Images 2,669 views
Parallel Implementations of the Cholesky Decomposition on CPUs and GPUs 2,667 views
Fast and Flexible: Parallel Packet Processing with GPUs and Click 2,665 views
CUDA Based CAMshift Algorithm for Object Tracking Systems 2,665 views
Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU 2,665 views
Accelerating IISPH: A Parallel GPGPU Solution Using CUDA 2,661 views
VHF SAR image formation implemented on a GPU 2,660 views
Convex Clustering: An Attractive Alternative to Hierarchical Clustering 2,659 views
An Introduction to OpenCL C++ 2,659 views
Auto-tunable GPU BLAS 2,658 views
Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines 2,658 views
String Matching on a Multicore GPU Using CUDA 2,657 views
GPU Sparse Matrix Multiplication with CUDA 2,656 views
Data Layout Oriented Compilation Techniques in Vectorization for Multi-/Many-cores 2,655 views
Computing Strongly Connected Components with CUDA 2,654 views
A Data-Parallel Graphics Pipeline Implemented in OpenCL 2,652 views
Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations 2,652 views
Cloth Simulation on the GPU 2,652 views
KERNELGEN – A Toolchain for Automatic GPU-centric Applications Porting 2,652 views
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA 2,651 views
GPGPU Performance Estimation with Core and Memory Frequency Scaling 2,651 views
Fast Mersenne prime testing on the GPU 2,651 views
A Performance Comparison of CUDA and OpenCL 2,649 views
Theano-based Large-Scale Visual Recognition with Multiple GPUs 2,649 views
Implementation of the genetic algorithm by means of CUDA technology involved in travelling salesman problem 2,649 views
Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning 2,647 views
Performance Analysis of Parallel Sorting Algorithms using GPU Computing 2,645 views
Real-time Image Processing on Low Cost Embedded Computers 2,644 views
A Parallel Algorithm of PCA-SIFT Based on CUDA 2,644 views
Deep Feature-based Face Detection on Mobile Devices 2,644 views
Warp-Level Divergence in GPUs: Characterization, Impact, and Mitigation 2,643 views
A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices 2,643 views
AES and DES Encryption with GPU 2,643 views
GPUGI: Global Illumination Effects on the GPU 2,643 views
OpenCL Performance Prediction using Architecture-Independent Features 2,642 views
Increasing GPU Throughput using Kernel Interleaved Thread Block Scheduling 2,641 views
DeepSpeech: Scaling up end-to-end speech recognition 2,641 views
The Comparisons of OpenCL and OpenMP Computing Paradigm 2,641 views
Efficient Parallel Methods for Deep Reinforcement Learning 2,640 views
A framework to implement a multifrontal scheme on GPU architectures with OpenCL 2,637 views
GPU Virtualization 2,637 views
Deep convolutional networks for pancreas segmentation in CT imaging 2,637 views
Fractal Based Method on Hardware Acceleration for Natural Environments 2,636 views
Dynamic Memory Allocation for OpenCL 2,636 views
fastHOG – a real-time GPU implementation of HOG 2,636 views
LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks 2,635 views
Molecular dynamics simulations through GPU video games technologies 2,634 views
MCS 572: Introduction to Supercomputing 2,634 views
Best Practice Guide Intel Xeon Phi v2.0 2,633 views
A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers 2,633 views
Programming Frameworks for Distributed Smartphone Computing 2,633 views
Parallel Execution of AES-CTR Algorithm Using Extended Block Size 2,632 views
FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs 2,631 views
GPU Accelerated Face Detection (thesis) 2,629 views
Hardware Transactional Memory for GPU Architectures 2,629 views
On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach 2,629 views
GPU-Based Airway Tree Segmentation and Centerline Extraction 2,629 views
Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels 2,628 views
Face Recognition Using OpenCL 2,627 views
Achieving TeraCUPS on Longest Common Subsequence Problem using GPGPUs 2,627 views
GPUburn: A System to Test and Mitigate GPU Hardware Failures 2,626 views
Parallelization of the Generalized Hough Transform on GPU 2,625 views
Numerical Computations with GPUs 2,624 views
Fast Burrows Wheeler Compression Using CPU and GPU 2,623 views
Synergia CUDA: GPU-accelerated accelerator modeling package 2,623 views
A short guide to CUDA C: For physicists with multi-core graphics cards 2,622 views
XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures 2,622 views
On Vectorization of Deep Convolutional Neural Networks for Vision Tasks 2,622 views
OpenCL C++ 2,621 views
A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads 2,621 views
libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications 2,619 views
Titles: 100
Total views: 265,165
- Programming - 186,131 views
- Login - 164,409 views
- User dashboard - 90,767 views
- Paper titles list - 70,168 views
- Add new event - 64,599 views
- Add new post - 59,379 views
- Register - 49,237 views
- Statistics - 36,639 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,167 views
- Books on OpenCL and CUDA - 28,826 views