Views of posts on hgpu.org

GPU-based Acceleration of Deep Convolutional Neural Networks on Mobile Platforms  3,202 views

Real-time Stereo Vision: Optimizing Semi-Global Matching  3,199 views

Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU  3,192 views

Softshell: Dynamic Scheduling on GPUs  3,176 views

Performance Evaluation of Deep Learning Tools in Docker Containers  3,176 views

Understanding Latency Hiding on GPUs  3,175 views

Brute-Force k-Nearest Neighbors Search on the GPU  3,175 views

Parallelization of BVH and BSP on the GPU  3,174 views

CUD@ASP: Experimenting with GPUs in ASP solving  3,173 views

Bi-directional Path Tracing on GPU  3,172 views

Massively Parallel Jacobian Computation  3,171 views

Acceleration of the MMFF94 routines within OpenBabel using Eigen and OpenCL  3,169 views

Stackless KD-Tree Traversal for High Performance GPU Ray Tracing  3,166 views

Progressive Clustering of Big Data with GPU Acceleration and Visualization  3,166 views

A Contour-Guided Deformable Image Registration Algorithm for Adaptive Radiotherapy  3,161 views

A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units  3,160 views

Fractal Video Compression in OpenCL: An Evaluation of CPUs, GPUs, and FPGAs as Acceleration Platforms  3,156 views

OpenOF: Framework for Sparse Non-linear Least Squares Optimization on a GPU  3,154 views

Fast Convolutional Nets With fbfft: A GPU Performance Evaluation  3,154 views

Numerical computations in Java with CUDA  3,152 views

Domain Specific Languages for High Performance Computing  3,151 views

An Implementation of Differential Evolution for Independent Tasks Scheduling on GPU  3,151 views

High speed cipher cracking: the case of Keeloq on CUDA  3,150 views

A Detailed GPU Cache Model Based on Reuse Distance Theory  3,146 views

Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging  3,145 views

Ultrasound goes GPU: real-time simulation using CUDA  3,144 views

Advanced illumination techniques for GPU volume raycasting  3,143 views

GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding  3,142 views

On Benchmarking the Matrix Multiplication Algorithm using OpenMP, MPI and CUDA Programming Languages  3,142 views

Multicore Computing: Algorithms, Architectures, and Applications  3,141 views

PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks  3,141 views

CUDA by Example: An Introduction to General-Purpose GPU Programming  3,136 views

Performance comparison of FPGA, GPU and CPU in image processing  3,135 views

GPU Computing  3,133 views

Real-time Semi-Global Matching on the CPU  3,133 views

Noise Removal from Remote Sensed Images by NonLocal Means with OpenCL Algorithm  3,131 views

Water Surface Animation using Damped Wave Equation and CUDA Acceleration  3,130 views

Real-time and Realistic Simulation of Large-scale Deep Ocean Wave Foams Based on GPU  3,122 views

Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures  3,120 views

JPEG-GPU: a GPGPU Implementation of JPEG Core Coding Systems  3,120 views

GPU Implementation of the Branch and Bound method for knapsack problems  3,118 views

Sample distribution shadow maps  3,116 views

Fast GPU-based image warping and inpainting for frame interpolation  3,116 views

GPU Pro 5: Advanced Rendering Techniques  3,113 views

SAGE: Self-Tuning Approximation for Graphics Engines  3,112 views

High performance finite difference PDE solvers on GPUs  3,111 views

New efficient integral algorithms for quantum chemistry  3,107 views

Functional Programming for High-Performance Computing on Heterogeneous Architectures  3,104 views

3-SAT on CUDA: Towards a massively parallel SAT solver  3,103 views

GPU implementation of JPEG XR  3,101 views

Recurrent Neural Networks Hardware Implementation on FPGA  3,100 views

Programming Massively Parallel Processors with CUDA (audio course)  3,090 views

GPU Computing for Machine Learning Algorithms  3,088 views

Using JavaScript and WebCL for Numerical Computations: A Comparative Study of Native and Web Technologies  3,087 views

An Optimized Large-Scale Hybrid DGEMM Design for CPUs and ATI GPUs  3,086 views

Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks  3,085 views

Bayesian Sparse Unsupervised Learning for Probit Models of Binary Data  3,084 views

Cudagrind: A Valgrind Extension for CUDA  3,081 views

Heterogeneous FTDT for Seismic Processing  3,081 views

Parallelization of the Algorithm WHAM with NVIDIA CUDA  3,080 views

Code Optimization on GPUs  3,079 views

Large Scale Plane Wave Pseudopotential Density Functional Theory Calculations on GPU Clusters  3,079 views

GPU Ray Tracing – Comparative Study of Ray-Triangle Intersection Algorithms  3,076 views

Evaluating different Java bindings for OpenCL  3,075 views

Implementation of Keccak hash function in Tree hashing mode on Nvidia GPU  3,071 views

Accelerating Deep Convolutional Neural Networks Using Specialized Hardware  3,070 views

Incomplete-LU and Cholesky Preconditioned Iterative Methods Using CUSPARSE and CUBLAS  3,069 views

QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems  3,069 views

Scientific Computing with Python on GPUs  3,068 views

The GPUVerify Method: a Tutorial Overview  3,065 views

Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption  3,062 views

Fast Implementation of Scale Invariant Feature Transform Based on CUDA  3,058 views

Blocks and Fuel: Frameworks for deep learning  3,056 views

.NET High Performance Computing  3,056 views

OpenCL for FPGAs: Prototyping a Compiler  3,054 views

DynaProg for Scala: A Scala DSL for Dynamic Programming on CPU and GPU  3,054 views

GPU-Quicksort: A practical Quicksort algorithm for graphics processors  3,049 views

Revisit Long Short-Term Memory: An Optimization Perspective  3,049 views

Benchmarking State-of-the-Art Deep Learning Software Tools  3,043 views

A New GPU-based Approach to the Shortest Path Problem  3,042 views

A Many-core Machine Model for Designing Algorithms with Minimum Parallelism Overheads  3,040 views

Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow  3,039 views

Melia: A MapReduce Framework on OpenCL-based FPGAs  3,038 views

Connectivity-Based Segmentation for GPU-Accelerated Mesh Decompression  3,037 views

GGAS: Global GPU Address Spaces for Efficient Communication in Heterogeneous Clusters  3,037 views

Arbitrary-Precision Arithmetics on the GPU  3,036 views

Hadoop Mapreduce OpenCL Plugin  3,035 views

Optimising OpenCL kernels for the ARM Mali-T600 GPUs  3,035 views

High-level GPU computing with jacket for MATLAB and C/C++  3,034 views

Efficient reconstruction of biological networks via transitive reduction on general purpose graphics processors  3,032 views

ReSYCLator: Transforming CUDA C++ source code into SYCL  3,031 views

Review and Comparative Study of Ray Traversal Algorithms on a Modern GPU Architecture  3,028 views

Lattice Boltzmann based PDE solver on the GPU  3,028 views

ShearLab 3D: Faithful Digital Shearlet Transforms based on Compactly Supported Shearlets  3,026 views

Accelerating the Gillespie Exact Stochastic Simulation Algorithm Using Hybrid Parallel Execution on Graphics Processing Units  3,026 views

2D/3D image registration on the GPU  3,021 views

Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes  3,019 views

Fast fluid dynamics simulation on the GPU  3,016 views

From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation  3,014 views

Massively Parallel A* Search on a GPU  3,012 views

 

Brief statistics for this page

Titles: 100

Total views: 309,970

 

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
