Views of posts on hgpu.org

Stochastic Gradient Descent on GPUs  2,151 views

DIANNE: Distributed Artificial Neural Networks for the Internet of Things  2,150 views

Quantum computer simulation using the CUDA programming model  2,150 views

The ANTAREX Domain Specific Language for High Performance Computing  2,150 views

A Flexible Kernel for Adaptive Mesh Refinement on GPU  2,150 views

Discrete Wavelet Transform on Consumer-Level Graphics Hardware  2,149 views

Accelerating Image Retrieval Using Factorial Correspondence Analysis on GPU  2,149 views

An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR)  2,149 views

Using GPU Simulation to Accurately Fit to the Power-Law Distribution  2,149 views

Porting to the Intel Xeon Phi: Opportunities and Challenges  2,149 views

Faster across the PCIe bus: A GPU library for lightweight decompression  2,149 views

Dense Matrix Algebra on the GPU  2,149 views

A comparison of period finding algorithms  2,149 views

MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures  2,149 views

Fast Parallel Image Registration on CPU and GPU for Diagnostic Classification of Alzheimer’s Disease  2,148 views

Pannotia: Understanding Irregular GPGPU Graph Applications  2,148 views

High Level Programming for Heterogeneous Architectures  2,148 views

High Performance Portable Tsunami Simulations on Many-core CPU, GPU, and FPGA  2,148 views

Strategies for Maximizing Utilization in multi-CPU & multi-GPU Heterogeneous Architectures  2,148 views

CUDA Based Enhanced Differential Evolution: a Computational Analysis  2,148 views

Fast Implementation of Two Hash Algorithms on nVidia CUDA GPU  2,148 views

Generic System Calls for GPUs  2,147 views

Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU  2,147 views

Multi GPU Performance of Conjugate Gradient Solver with Staggered Fermions in Mixed Precision  2,147 views

Using Graphics Processing Unit to Accelerate Database Query Execution  2,147 views

A fight for performance and accuracy of the matrix multiplication routines: CUBLAS on Nvidia Tesla versus MKL and ATLAS on Intel Nehalem  2,147 views

Auto-tuning 3-D FFT library for CUDA GPUs  2,147 views

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation  2,147 views

Influence of InfiniBand FDR on the Performance of Remote GPU Virtualization  2,146 views

Key Reconciliation with Low-Density Parity-Check Codes for Long-Distance Quantum Cryptography  2,146 views

Mapping parallel programs to heterogeneous multi-core systems  2,146 views

Very fast ellipse detection using GPU-based RHT  2,146 views

Blum Blum Shub on the GPU  2,146 views

A Novel Mapping of Arbitrary Precision Integer Operations to the GPU  2,146 views

CuPP – A framework for easy CUDA integration  2,145 views

Scheduling on Manycore and Heterogeneous Graphics Processors  2,144 views

PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming  2,144 views

Heterogeneous Clustering with Homogeneous Code: Accelerate MPI Applications Without Code Surgery Using Intel Xeon Phi Coprocessors  2,144 views

Improving GPU Sparse Matrix-Vector Multiplication for Probabilistic Model Checking  2,144 views

Hardware thread reordering to boost OpenCL throughput on FPGAs  2,144 views

Spatter: A Benchmark Suite for Evaluating Sparse Access Patterns  2,144 views

Investigating performance variations of an optimized GPU-ported granulometry algorithm  2,143 views

Multi-GPU Island-Based Genetic Algorithm for Solving the Knapsack Problem  2,143 views

Computing Prestack Kirchhoff Time Migration on General Purpose GPU  2,143 views

Seismic damage simulation for urban buildings based on high-performance GPU computing  2,143 views

Parallel Graph Component Labelling with GPUs and CUDA  2,143 views

Computation of the Isogeometric Analysis Stiffness Matrix on GPU  2,143 views

A Scalable Framework for Monte Carlo Simulation Using FPGA-based Hardware Accelerators with Application to SPECT Imaging  2,143 views

Exploiting Task Parallelism with OpenCL: A Case Study  2,142 views

A Comprehensive Performance Comparison of CUDA and OpenCL  2,142 views

On algorithmic reductions in task-parallel programming models  2,142 views

Efficient Preconditioned Conjugate Gradient Parallelization on GPU  2,141 views

Single-Pass GPU-Raycasting for Structured Adaptive Mesh Refinement Data  2,141 views

A Survey Of Techniques for Approximate Computing  2,141 views

A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems  2,141 views

A tutorial overview on the properties of the discrete cosine transform for encoded image and video processing  2,140 views

Parallelization of KMP String Matching Algorithm on Different SIMD architectures: Multi-Core and GPGPU’s  2,140 views

Acceleration of a QM/MM-QMC simulation using GPU  2,140 views

Realistic Lighting Simulation for Interactive VR Applications  2,140 views

Parallel Cloth Simulation Using OpenMP and CUDA  2,139 views

A Smart GPU Implementation of an Elliptic Kernel for an Ocean Global Circulation Model  2,139 views

Performance Comparison of Cholesky Decomposition on GPUs and FPGAs  2,139 views

Data Parallel Quadtree Indexing and Spatial Query Processing of Complex Polygon Data on GPUs  2,139 views

3D Information Extraction Based on GPU  2,138 views

The Parallel Processing Based on CUDA for Convolution Filter FDK Reconstruction of CT  2,138 views

CLgrep: A Parallel String Matching Tool  2,138 views

A Comparison of Gradient Estimation Methods for Volume Rendering on Unstructured Meshes  2,138 views

Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models  2,137 views

High precision integer multiplication with a graphics processing unit  2,137 views

DEF-G: Declarative Framework for GPU Environment  2,137 views

Development and evaluation of a GPU-optimized N-body term for the simulation of biomolecules  2,137 views

High-quality surface splatting on today’s GPUs  2,137 views

Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures  2,136 views

Multi-swarm PSO algorithm for the Quadratic Assignment Problem: a massive parallel implementation on the OpenCL platform  2,136 views

Accelerating Interpreted Programming Languages on GPUs with Just-In-Time Compilation and Runtime Optimisations  2,135 views

Vectorized OpenCL implementation of numerical integration for higher order finite elements  2,135 views

Performance evaluation of H.264/AVC decoding and visualization using the GPU  2,134 views

Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation  2,134 views

Tight Binding Molecular Dynamics on CPU and GPU clusters  2,134 views

dOpenCL – Evaluation of an API-Forwarding Implementation  2,134 views

Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs  2,134 views

Implementation of Spectral Angle Mapper (SAM) Algorithm on a Graphic processing unit (GPU)  2,134 views

Performance Optimization of Vision Apps on Mobile Application Processor  2,133 views

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism  2,133 views

Accelerated protein structure comparison using TM-score-GPU  2,133 views

GraviDy: a GPU modular, parallel N-body integrator  2,133 views

GPU-based Numerical Integration in the Partition of Unity Method  2,133 views

Fast GPGPU-Based Elliptic Curve Scalar Multiplication  2,133 views

Non-Local Total Generalized Variation for Optical Flow Estimation  2,133 views

Fast 3D Wavelet Transform on Multicore and Manycore Computing Platforms  2,133 views

Register packing for cyclic reduction: a case study  2,132 views

CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU  2,132 views

Scaling Monte Carlo Tree Search on Intel Xeon Phi  2,132 views

An Efficient GPU Implementation of Modified Discrete Cosine Transform Using CUDA  2,132 views

GPU accelerated Monte Carlo simulation of Brownian motors dynamics with CUDA  2,132 views

Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs  2,131 views

Energy-efficient algorithms  2,131 views

Flexible Software Profiling of GPU Architectures  2,131 views

Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL  2,131 views

Cone-beam Computed tomography image reconstruction based on GPU  2,131 views

 

Brief statistics for this page

Titles: 100

Total views: 214,111

 


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
