Views of posts on hgpu.org

GPU Computing for Machine Learning Algorithms  3,469 views

A Survey of CPU-GPU Heterogeneous Computing Techniques  3,467 views

Heterogeneous FTDT for Seismic Processing  3,465 views

JPEG-GPU: a GPGPU Implementation of JPEG Core Coding Systems  3,465 views

A Comparative Study of Game Tree Searching Methods  3,458 views

Scientific Computing with Python on GPUs  3,456 views

Massively Parallel A* Search on a GPU  3,454 views

Computing resultants on Graphics Processing Units: Towards GPU-accelerated computer algebra  3,453 views

Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing  3,453 views

From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation  3,451 views

Accelerating video decoding using GPU  3,449 views

Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors  3,449 views

A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs  3,448 views

GPU Computing Gems: Emerald Edition  3,446 views

Arbitrary-Precision Arithmetics on the GPU  3,443 views

Hadoop Mapreduce OpenCL Plugin  3,441 views

HCudaBLAST: an implementation of BLAST on Hadoop and Cuda  3,437 views

Review and Comparative Study of Ray Traversal Algorithms on a Modern GPU Architecture  3,437 views

Revisit Long Short-Term Memory: An Optimization Perspective  3,435 views

OpenCL Fast Fourier Transform  3,435 views

Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow  3,434 views

Large Integer Arithmetic in GPU for Cryptography  3,431 views

Blocks and Fuel: Frameworks for deep learning  3,430 views

Face Recognition: A Tutorial on Computational Aspects  3,428 views

How to Benefit from AMD, Intel and Nvidia Accelerator Technologies in Scilab  3,428 views

Rootbeer: Seamlessly using GPUs from Java  3,425 views

FIR filtering and AES encryption with OpenCL 2.0  3,422 views

Accelerating Deep Convolutional Neural Networks Using Specialized Hardware  3,420 views

OpenCL for FPGAs: Prototyping a Compiler  3,418 views

Training Neural Networks Without Gradients: A Scalable ADMM Approach  3,412 views

Evaluating different Java bindings for OpenCL  3,412 views

Optimising OpenCL kernels for the ARM Mali-T600 GPUs  3,411 views

ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale  3,408 views

A New GPU-based Approach to the Shortest Path Problem  3,407 views

A GPU-based Approximate SVD Algorithm  3,406 views

Distributed OpenCL: a platform for distributed, heterogeneous computing for domain scientists  3,404 views

Weighted Residuals for Very Deep Networks  3,404 views

A Many-core Machine Model for Designing Algorithms with Minimum Parallelism Overheads  3,404 views

Efficient Implementation of Bi-directional Path Tracer on GPU  3,395 views

A compiler toolkit for array-based languages targeting CPU/GPU hybrid systems  3,393 views

Forecasting high frequency financial time series using parallel FFN with CUDA and ZeroMQ  3,391 views

Fast BVH Construction on GPUs  3,388 views

SODECL: An Open Source Library for Calculating Multiple Orbits of a System of Stochastic Differential Equations in Parallel  3,387 views

Ocean wave simulation in real-time using GPU  3,385 views

Accelerating MapReduce on a coupled CPU-GPU architecture  3,381 views

A (ir)regularity-aware task scheduler for heterogeneous platforms  3,381 views

Parallel face Detection and Recognition on GPU  3,380 views

GPU-SD and DPD Parallelization for Gromacs tools for molecular dynamics simulations  3,379 views

GPU Scripting and Code Generation with PyCUDA  3,379 views

A parallel algorithm for implicit depletant simulations  3,378 views

Using GPU-based Computing To Accelerate Finite Element Problems  3,378 views

Sparse Matrix Matrix Multiplication on Hybrid CPU+GPU Platforms  3,375 views

Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs  3,375 views

An Introduction to the OpenCL Programming Model  3,375 views

Bayesian Sparse Unsupervised Learning for Probit Models of Binary Data  3,374 views

Accelerating the Gillespie Exact Stochastic Simulation Algorithm Using Hybrid Parallel Execution on Graphics Processing Units  3,374 views

A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures  3,373 views

GPU-Acceleration of Linear Algebra using OpenCL  3,373 views

A Single (Unified) Shader GPU Microarchitecture for Embedded Systems  3,371 views

Batched Shift Reduce Parsing with Lists of Vectors on CUDA  3,369 views

Real-Time Incompressible Fluid Simulation on the GPU  3,369 views

Real time mitigation of atmospheric turbulence in long distance imaging using the lucky region fusion algorithm with FPGA and GPU hardware acceleration  3,368 views

Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices  3,366 views

TABLA: A Unified Template-based Framework for Accelerating Statistical Machine Learning  3,365 views

Shredder: GPU-Accelerated Incremental Storage and Computation  3,360 views

Deep Dynamic Neural Networks for Gesture Segmentation and Recognition  3,357 views

Theano: A Python framework for fast computation of mathematical expressions  3,355 views

GPU-based 3D Wavelet Transform  3,351 views

vSMC: Parallel Sequential Monte Carlo in C++  3,349 views

A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems  3,347 views

C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++  3,347 views

Efficient GPGPU-based parallel packet classification  3,344 views

OpenMP for Accelerators  3,341 views

Importance of Explicit Vectorization for CPU and GPU Software Performance  3,340 views

gem5-gpu: A Heterogeneous CPU-GPU Simulator  3,338 views

A Complete Description of the UnPython and Jit4GPU Framework  3,336 views

eccCL: parallelized GPU implementation of Ensemble Classifier Chains  3,336 views

Performance Portability Evaluation for OpenACC on Intel Knights Corner and Nvidia Kepler  3,335 views

Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor  3,333 views

Real-time Flame Rendering with GPU and CUDA  3,332 views

Genetically Improved CUDA kernels for StereoCamera  3,331 views

FFT and Convolution Performance in Image Filtering on GPU  3,330 views

Exposure Render: An Interactive Photo-Realistic Volume Rendering Framework  3,329 views

Monte-Carlo Black-Scholes Implementation using OpenCL Standard  3,328 views

High Performance Programming for Soft Computing  3,328 views

Massive Parallel Implementation of ODE Solvers  3,324 views

Implementation of a Lattice–Boltzmann method for numerical fluid mechanics using the nVIDIA CUDA technology  3,322 views

Template Library for Multi-GPU Pseudorandom Number Recursion-based Generators  3,322 views

The MOSIX Cluster Operating System for High-Performance Computing on Linux Clusters, Multi-Clusters, GPU Clusters and Clouds  3,321 views

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data  3,319 views

Fast 2-D Ultrasound Strain Imaging: The Benefits of Using a GPU  3,318 views

Parallel Hashing, Compression and Encryption with OpenCL under OS X  3,317 views

Development of a GPU-accelerated MIKE 21 Solver for Water Wave Dynamics  3,316 views

Sparselet Models for Efficient Multiclass Object Detection  3,316 views

OpenCL-Z Android Released on Google Play  3,314 views

TensorFlow: A system for large-scale machine learning  3,313 views

Up to 700k GPU cores, Kepler, and the Exascale future for simulations of star clusters around black holes  3,312 views

GPU Based Acceleration of Telegraph Equation  3,312 views

PROST: Parallel robust online simple tracking  3,309 views

Fast Hair Simulation and Rendering Using CUDA and OpenGL  3,309 views

 

Brief statistics for this page

Titles: 100

Total views: 338,228

 

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org