
Views of posts on hgpu.org

An MPI-Based Python Framework for Distributed Training with Keras  2,250 views

ART vs. NDK vs. GPU acceleration: A study of performance of image processing algorithms on Android  2,249 views

Subdivision Surface Evaluation as Sparse Matrix-Vector Multiplication  2,249 views

Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets  2,249 views

A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels  2,249 views

Energy Efficiency Analysis of GPUs  2,248 views

Mars: a MapReduce framework on graphics processors  2,248 views

Graph Coarsening and Clustering on the GPU  2,247 views

Mapping the SBR and TW-ILDCs to Heterogeneous CPU-GPU Architecture for Fast Computation of Electromagnetic Scattering  2,247 views

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)  2,247 views

Parallel Genetic Algorithms on a GPU to Solve the Travelling Salesman Problem  2,247 views

Reducing branch divergence in GPU programs  2,247 views

GPU Accelerated Automated Feature Extraction from Satellite Images  2,247 views

The Dynamical Kernel Scheduler – Part 1  2,247 views

Fast Hamiltonian Monte Carlo Using GPU Computing  2,247 views

gpuSPHASE – A shared memory caching implementation for 2D SPH using CUDA  2,246 views

ClearPath: highly parallel collision avoidance for multi-agent simulation  2,246 views

PFAC Library: GPU-based string matching algorithm  2,246 views

Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks  2,246 views

ASAMgpu V1.0-a moist fully compressible atmospheric model using graphics processing units (GPUs)  2,246 views

Real-Time Plane-Sweeping Stereo with Multiple Sweeping Directions  2,246 views

GPU Acceleration of Runge-Kutta Integrators  2,246 views

Performance and Power Comparisons Between Nvidia and ATI GPUs  2,246 views

Accelerating MATLAB Image Processing Toolbox functions on GPUs  2,246 views

Accelerating Habanero-Java Programs with OpenCL Generation  2,246 views

A Map-Reduce-Like System for Programming and Optimizing Data-Intensive Computations on Emerging Parallel Architectures  2,246 views

A GPU-based architecture for real-time data assessment at synchrotron experiments  2,246 views

A Code Transformation Framework for Scientific Applications on Structured Grids  2,245 views

Face Detection on CUDA  2,245 views

An ultrasonic imaging system based on a new SAFT approach and a GPU beamformer  2,245 views

Device specialization in heterogeneous multi-GPU environments  2,244 views

Research on CUDA-based Kriging Interpolation Algorithm  2,244 views

Implementation of Smith-Waterman Algorithm in OpenCL for GPUs  2,244 views

The AES Implantation Based on OpenCL for Multi/many Core Architecture  2,244 views

Fast Predictive Image Registration  2,243 views

A characterization and analysis of PTX kernels  2,243 views

Fast TV-L1 Optical Flow for Interactivity  2,243 views

Lattice Boltzmann Simulations on a GPU: An optimization approach using C++ AMP  2,242 views

Image Encryption Using Parallel RSA Algorithm on CUDA  2,242 views

FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10  2,242 views

ImageCL: Language and source-to-source compiler for performance portability, load balancing, and scalability prediction on heterogeneous systems  2,241 views

Design and Optimization of Image Processing Algorithms on Mobile GPU  2,241 views

B-CALM: An open-source GPU-based 3D-FDTD with multi-pole dispersion for plasmonics  2,241 views

GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model  2,241 views

Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments  2,240 views

Fast GPU-based calculations in few-body quantum scattering  2,240 views

The battle of the giants: a case study of GPU vs FPGA optimisation for real-time image processing  2,240 views

On the Way to Future’s High Energy Particle Physics Transport Code  2,240 views

GPU-Based Computation of Discrete Periodic Centroidal Voronoi Tessellation in Hyperbolic Space  2,239 views

HyPHI – task based hybrid execution C++ library for the Intel Xeon Phi coprocessor  2,239 views

A GPU-based hyperbolic SVD algorithm  2,239 views

StePS: A Multi-GPU Cosmological N-body Code for Compactified Simulations  2,238 views

On Binaural Spatialization and the Use of GPGPU for Audio Processing  2,238 views

A Comparison of the performance of HPC Accelerators  2,238 views

Geometric Algebra Enhanced Precompiler for C++, OpenCL and Mathematica’s OpenCLLink  2,238 views

GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads  2,237 views

Speeding up Large-Scale Point-in-Polygon Test Based Spatial Join on GPUs  2,237 views

Microbenchmarks for GPU characteristics: the occupancy roofline and the pipeline model  2,237 views

Intel FPGA SDK for OpenCL  2,237 views

Ray Reordering Techniques for GPU Ray-Cast Ambient Occlusion  2,237 views

GPU Based Generation and Real-Time Rendering of Semi-Procedural Terrain Using Features  2,237 views

On Reinforcement Learning for Full-length Game of StarCraft  2,237 views

An architecture for real time fluid simulation using multiple GPUs  2,236 views

Parallelization and Performance of the NIM Weather Model for CPU, GPU and MIC Processors  2,236 views

Cascaded Segmentation-Detection Networks for Word-Level Text Spotting  2,236 views

Teaching cardiac electrophysiology modeling to undergraduate students: laboratory exercises and GPU programming for the study of arrhythmias and spiral wave dynamics  2,235 views

GPU Implementation of the DP code  2,235 views

Accelerating SQL Database Operations on a GPU with CUDA  2,235 views

GPU implemention of fast Gabor filters  2,235 views

Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS  2,235 views

Sparse Matrix-Vector Multiplication on GPGPUs  2,234 views

Locality optimization on a NUMA architecture for hybrid LU factorization  2,233 views

GPU Programming for Physics Applications  2,233 views

Model-driven autotuning of sparse matrix-vector multiply on GPUs  2,233 views

Shared Sampling for Real-Time Alpha Matting  2,232 views

GPU concurrency: Weak behaviours and programming assumptions  2,232 views

GGNN: Graph-based GPU Nearest Neighbor Search  2,232 views

Fast seismic modeling and Reverse Time Migration on a GPU cluster  2,231 views

Non-local means denoising algorithm accelerated by GPU  2,231 views

Optimization and Parallelization Methods for the Design of Next-Generation Radio Networks  2,231 views

How to scale distributed deep learning?  2,230 views

Parallel Chen-Han (PCH) Algorithm for Discrete Geodesics  2,230 views

Solving Linear Recurrences on Hybrid GPU Accelerated Manycore Systems  2,229 views

HSApriori: High Speed Association Rule Mining using Apriori Based Algorithm for GPU  2,229 views

A Quantitative Study of Irregular Programs on GPUs  2,229 views

The Performance Analysis Based on Heterogeneous Parallel Processors for Anisotropic Diffusion Filters  2,229 views

An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures  2,229 views

XSD: Accelerating MapReduce by Harnessing the GPU inside an SSD  2,228 views

Implementing Ultrasound Beamforming on the GPU using CUDA  2,228 views

Mersenne Twister Random Number Generation on FPGA, CPU and GPU  2,228 views

Designing a Unified Programming Model for Heterogeneous Machines  2,228 views

A Quantitative Performance Analysis Model for GPU Architectures  2,228 views

Large-Scale Compute-Intensive Analysis via a Combined In-Situ and Co-Scheduling Workflow Approach  2,228 views

Digital beamforming using a GPU  2,227 views

Interactive volumetric lighting simulating scattering and shadowing  2,227 views

Persistent RNNs: Stashing Recurrent Weights On-Chip  2,227 views

Computation of gray-level co-occurrence matrix based on CUDA and its optimization  2,227 views

Supporting Applications Involving Dynamic Data Structures and Irregular Memory Access on Emerging Parallel Platforms  2,227 views

Computing the distance between two finite element solutions defined on different 3D meshes on a GPU  2,227 views

Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU architectures  2,226 views


Brief statistics for this page

Titles: 100

Total views: 223,854



HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
