Views of posts on hgpu.org
Stochastic Gradient Descent on GPUs 2,151 views
DIANNE: Distributed Artificial Neural Networks for the Internet of Things 2,150 views
Quantum computer simulation using the CUDA programming model 2,150 views
The ANTAREX Domain Specific Language for High Performance Computing 2,150 views
A Flexible Kernel for Adaptive Mesh Refinement on GPU 2,150 views
Discrete Wavelet Transform on Consumer-Level Graphics Hardware 2,149 views
Accelerating Image Retrieval Using Factorial Correspondence Analysis on GPU 2,149 views
An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR) 2,149 views
Using GPU Simulation to Accurately Fit to the Power-Law Distribution 2,149 views
Porting to the Intel Xeon Phi: Opportunities and Challenges 2,149 views
Faster across the PCIe bus: A GPU library for lightweight decompression 2,149 views
Dense Matrix Algebra on the GPU 2,149 views
A comparison of period finding algorithms 2,149 views
Fast Parallel Image Registration on CPU and GPU for Diagnostic Classification of Alzheimer’s Disease 2,148 views
Pannotia: Understanding Irregular GPGPU Graph Applications 2,148 views
High Level Programming for Heterogeneous Architectures 2,148 views
High Performance Portable Tsunami Simulations on Many-core CPU, GPU, and FPGA 2,148 views
Strategies for Maximizing Utilization in multi-CPU & multi-GPU Heterogeneous Architectures 2,148 views
CUDA Based Enhanced Differential Evolution: a Computational Analysis 2,148 views
Fast Implementation of Two Hash Algorithms on nVidia CUDA GPU 2,148 views
Generic System Calls for GPUs 2,147 views
Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU 2,147 views
Multi GPU Performance of Conjugate Gradient Solver with Staggered Fermions in Mixed Precision 2,147 views
Using Graphics Processing Unit to Accelerate Database Query Execution 2,147 views
Auto-tuning 3-D FFT library for CUDA GPUs 2,147 views
Influence of InfiniBand FDR on the Performance of Remote GPU Virtualization 2,146 views
Key Reconciliation with Low-Density Parity-Check Codes for Long-Distance Quantum Cryptography 2,146 views
Mapping parallel programs to heterogeneous multi-core systems 2,146 views
Very fast ellipse detection using GPU-based RHT 2,146 views
Blum Blum Shub on the GPU 2,146 views
A Novel Mapping of Arbitrary Precision Integer Operations to the GPU 2,146 views
CuPP – A framework for easy CUDA integration 2,145 views
Scheduling on Manycore and Heterogeneous Graphics Processors 2,144 views
PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming 2,144 views
Improving GPU Sparse Matrix-Vector Multiplication for Probabilistic Model Checking 2,144 views
Hardware thread reordering to boost OpenCL throughput on FPGAs 2,144 views
Spatter: A Benchmark Suite for Evaluating Sparse Access Patterns 2,144 views
Investigating performance variations of an optimized GPU-ported granulometry algorithm 2,143 views
Multi-GPU Island-Based Genetic Algorithm for Solving the Knapsack Problem 2,143 views
Computing Prestack Kirchhoff Time Migration on General Purpose GPU 2,143 views
Seismic damage simulation for urban buildings based on high-performance GPU computing 2,143 views
Parallel Graph Component Labelling with GPUs and CUDA 2,143 views
Computation of the Isogeometric Analysis Stiffness Matrix on GPU 2,143 views
Exploiting Task Parallelism with OpenCL: A Case Study 2,142 views
A Comprehensive Performance Comparison of CUDA and OpenCL 2,142 views
On algorithmic reductions in task-parallel programming models 2,142 views
Efficient Preconditioned Conjugate Gradient Parallelization on GPU 2,141 views
Single-Pass GPU-Raycasting for Structured Adaptive Mesh Refinement Data 2,141 views
A Survey Of Techniques for Approximate Computing 2,141 views
A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems 2,141 views
Parallelization of KMP String Matching Algorithm on Different SIMD architectures: Multi-Core and GPGPU’s 2,140 views
Acceleration of a QM/MM-QMC simulation using GPU 2,140 views
Realistic Lighting Simulation for Interactive VR Applications 2,140 views
Parallel Cloth Simulation Using OpenMP and CUDA 2,139 views
A Smart GPU Implementation of an Elliptic Kernel for an Ocean Global Circulation Model 2,139 views
Performance Comparison of Cholesky Decomposition on GPUs and FPGAs 2,139 views
Data Parallel Quadtree Indexing and Spatial Query Processing of Complex Polygon Data on GPUs 2,139 views
3D Information Extraction Based on GPU 2,138 views
The Parallel Processing Based on CUDA for Convolution Filter FDK Reconstruction of CT 2,138 views
CLgrep: A Parallel String Matching Tool 2,138 views
A Comparison of Gradient Estimation Methods for Volume Rendering on Unstructured Meshes 2,138 views
Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models 2,137 views
High precision integer multiplication with a graphics processing unit 2,137 views
DEF-G: Declarative Framework for GPU Environment 2,137 views
Development and evaluation of a GPU-optimized N-body term for the simulation of biomolecules 2,137 views
High-quality surface splatting on today’s GPUs 2,137 views
Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures 2,136 views
Vectorized OpenCL implementation of numerical integration for higher order finite elements 2,135 views
Performance evaluation of H.264/AVC decoding and visualization using the GPU 2,134 views
Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation 2,134 views
Tight Binding Molecular Dynamics on CPU and GPU clusters 2,134 views
dOpenCL – Evaluation of an API-Forwarding Implementation 2,134 views
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs 2,134 views
Implementation of Spectral Angle Mapper (SAM) Algorithm on a Graphic processing unit (GPU) 2,134 views
Performance Optimization of Vision Apps on Mobile Application Processor 2,133 views
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism 2,133 views
Accelerated protein structure comparison using TM-score-GPU 2,133 views
GraviDy: a GPU modular, parallel N-body integrator 2,133 views
GPU-based Numerical Integration in the Partition of Unity Method 2,133 views
Fast GPGPU-Based Elliptic Curve Scalar Multiplication 2,133 views
Non-Local Total Generalized Variation for Optical Flow Estimation 2,133 views
Fast 3D Wavelet Transform on Multicore and Manycore Computing Platforms 2,133 views
Register packing for cyclic reduction: a case study 2,132 views
CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU 2,132 views
Scaling Monte Carlo Tree Search on Intel Xeon Phi 2,132 views
An Efficient GPU Implementation of Modified Discrete Cosine Transform Using CUDA 2,132 views
GPU accelerated Monte Carlo simulation of Brownian motors dynamics with CUDA 2,132 views
Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs 2,131 views
Energy-efficient algorithms 2,131 views
Flexible Software Profiling of GPU Architectures 2,131 views
Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL 2,131 views
Cone-beam Computed tomography image reconstruction based on GPU 2,131 views
Titles: 100
Total views: 214,111
- Programming - 186,133 views
- Login - 164,571 views
- User dashboard - 91,322 views
- Paper titles list - 71,383 views
- Add new event - 64,819 views
- Add new post - 59,626 views
- Register - 49,322 views
- Statistics - 37,182 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,194 views
- Books on OpenCL and CUDA - 28,901 views