
Views of posts on hgpu.org

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training  2,693 views

A GPU-Based Wide-Band Radio Spectrometer  2,692 views

Unsupervised Asset Cluster Analysis Implemented with Parallel Genetic Algorithms on the NVIDIA CUDA Platform  2,689 views

GPU Accelerated Greedy Algorithms for Compressed Sensing  2,688 views

Multi-GPU Acceleration of Black-Scholes Equation based Option Pricing  2,688 views

GPU Implementation of the Particle Filter  2,685 views

Finite Pointset Method for 2D Dam-Break Problem with GPU-Acceleration  2,685 views

ZNN – A Fast and Scalable Algorithm for Training 3D Convolutional Networks on Multi-Core and Many-Core Shared Memory Machines  2,685 views

G-SNPM – A GPU-based SNP mapping tool  2,684 views

A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory  2,683 views

3DES ECB Optimized for Massively Parallel CUDA GPU Architecture  2,682 views

High Throughput Low Latency LDPC Decoding on GPU for SDR Systems  2,681 views

The Accelerator Wall: Limits of Chip Specialization  2,680 views

Evolution of thread-level parallelism in desktop applications  2,680 views

FPGA and GPU implementation of large scale SpMV  2,679 views

Speeding up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves  2,679 views

Precomputed Atmospheric Scattering  2,678 views

Image Denoising Using Wavelet Transform and CUDA  2,678 views

Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths  2,674 views

Hybrid CPU-GPU Implementation of Tracking-Learning-Detection Algorithm  2,673 views

cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU  2,672 views

Efficient softmax approximation for GPUs  2,671 views

Implementing density functional theory (DFT) methods on many-core GPGPU accelerators  2,670 views

Ray Tracing on GPUs  2,670 views

Local Laplacian Filters: Edge-aware Image Processing with a Laplacian Pyramid  2,670 views

3D Recursive Gaussian IIR on GPU and FPGAs: A Case Study for Accelerating Bandwidth-Bounded Applications  2,669 views

Multi-Scale, Multi-Level, Heterogeneous Features Extraction and Classification of Volumetric Medical Images  2,669 views

Parallel Implementations of the Cholesky Decomposition on CPUs and GPUs  2,667 views

A biomolecular electrostatics solver using Python, GPUs and boundary elements that can handle solvent-filled cavities and Stern layers  2,666 views

Fast and Flexible: Parallel Packet Processing with GPUs and Click  2,665 views

CUDA Based CAMshift Algorithm for Object Tracking Systems  2,665 views

Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU  2,665 views

Accelerating IISPH: A Parallel GPGPU Solution Using CUDA  2,661 views

VHF SAR image formation implemented on a GPU  2,660 views

Convex Clustering: An Attractive Alternative to Hierarchical Clustering  2,659 views

An Introduction to OpenCL C++  2,659 views

Auto-tunable GPU BLAS  2,658 views

Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines  2,658 views

String Matching on a Multicore GPU Using CUDA  2,657 views

GPU Sparse Matrix Multiplication with CUDA  2,656 views

Data Layout Oriented Compilation Techniques in Vectorization for Multi-/Many-cores  2,655 views

Computing Strongly Connected Components with CUDA  2,654 views

A Data-Parallel Graphics Pipeline Implemented in OpenCL  2,652 views

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations  2,652 views

Cloth Simulation on the GPU  2,652 views

KERNELGEN – A Toolchain for Automatic GPU-centric Applications Porting  2,652 views

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA  2,651 views

GPGPU Performance Estimation with Core and Memory Frequency Scaling  2,651 views

Fast Mersenne prime testing on the GPU  2,651 views

A Performance Comparison of CUDA and OpenCL  2,649 views

Theano-based Large-Scale Visual Recognition with Multiple GPUs  2,649 views

Implementation of the genetic algorithm by means of CUDA technology involved in travelling salesman problem  2,649 views

Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning  2,647 views

Performance Analysis of Parallel Sorting Algorithms using GPU Computing  2,645 views

Parallelizing flow-accumulation calculations on graphics processing units – From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm  2,645 views

Real-time Image Processing on Low Cost Embedded Computers  2,644 views

A Parallel Algorithm of PCA-SIFT Based on CUDA  2,644 views

Deep Feature-based Face Detection on Mobile Devices  2,644 views

Warp-Level Divergence in GPUs: Characterization, Impact, and Mitigation  2,643 views

A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices  2,643 views

AES and DES Encryption with GPU  2,643 views

GPUGI: Global Illumination Effects on the GPU  2,643 views

OpenCL Performance Prediction using Architecture-Independent Features  2,642 views

Increasing GPU Throughput using Kernel Interleaved Thread Block Scheduling  2,641 views

DeepSpeech: Scaling up end-to-end speech recognition  2,641 views

The Comparisons of OpenCL and OpenMP Computing Paradigm  2,641 views

Efficient Parallel Methods for Deep Reinforcement Learning  2,640 views

A framework to implement a multifrontal scheme on GPU architectures with OpenCL  2,637 views

GPU Virtualization  2,637 views

Deep convolutional networks for pancreas segmentation in CT imaging  2,637 views

Fractal Based Method on Hardware Acceleration for Natural Environments  2,636 views

Dynamic Memory Allocation for OpenCL  2,636 views

fastHOG – a real-time GPU implementation of HOG  2,636 views

LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks  2,635 views

Molecular dynamics simulations through GPU video games technologies  2,634 views

MCS 572: Introduction to Supercomputing  2,634 views

Best Practice Guide Intel Xeon Phi v2.0  2,633 views

A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers  2,633 views

Programming Frameworks for Distributed Smartphone Computing  2,633 views

Parallel Execution of AES-CTR Algorithm Using Extended Block Size  2,632 views

FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs  2,631 views

GPU Accelerated Face Detection (thesis)  2,629 views

Hardware Transactional Memory for GPU Architectures  2,629 views

On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach  2,629 views

Energy Consumption of Algorithms for Solving the Compressible Navier-Stokes Equations on CPU’s, GPU’s and KNL’s  2,629 views

GPU-Based Airway Tree Segmentation and Centerline Extraction  2,629 views

Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels  2,628 views

Face Recognition Using OpenCL  2,627 views

Achieving TeraCUPS on Longest Common Subsequence Problem using GPGPUs  2,627 views

GPUburn: A System to Test and Mitigate GPU Hardware Failures  2,626 views

Parallelization of the Generalized Hough Transform on GPU  2,625 views

Numerical Computations with GPUs  2,624 views

Fast Burrows Wheeler Compression Using CPU and GPU  2,623 views

Synergia CUDA: GPU-accelerated accelerator modeling package  2,623 views

A short guide to CUDA C: For physicists with multi-core graphics cards  2,622 views

XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures  2,622 views

On Vectorization of Deep Convolutional Neural Networks for Vision Tasks  2,622 views

OpenCL C++  2,621 views

A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads  2,621 views

libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications  2,619 views

 

Brief statistics for this page

Titles: 100

Total views: 265,165

 


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
