Views of posts on hgpu.org
Multi-Tasking Scheduling for Heterogeneous Systems 3,644 views
An Optimized Large-Scale Hybrid DGEMM Design for CPUs and ATI GPUs 3,641 views
Parallelization of BVH and BSP on the GPU 3,636 views
Histogram Computations on GPUs Kernel using Global and Shared Memory Atomics 3,635 views
Ultrasound goes GPU: real-time simulation using CUDA 3,635 views
Multicore Computing: Algorithms, Architectures, and Applications 3,634 views
Softshell: Dynamic Scheduling on GPUs 3,633 views
GPU Accelerated Nonlinear Optimization in Radio Interferometric Calibration 3,631 views
Progressive Clustering of Big Data with GPU Acceleration and Visualization 3,626 views
Triangular mesh simplification on the GPU 3,622 views
Memory transfer optimization for a lattice Boltzmann solver on Kepler architecture nVidia GPUs 3,617 views
Melia: A MapReduce Framework on OpenCL-based FPGAs 3,612 views
Acceleration of the MMFF94 routines within OpenBabel using Eigen and OpenCL 3,612 views
Medical imaging using CUDA 3,612 views
Real-time Stereo Vision: Optimizing Semi-Global Matching 3,611 views
Deep learning review and its applications 3,607 views
Ray Tracing in the Cloud using MapReduce 3,602 views
Using JavaScript and WebCL for Numerical Computations: A Comparative Study of Native and Web Technologies 3,601 views
Functional Programming for High-Performance Computing on Heterogeneous Architectures 3,601 views
GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding 3,600 views
GPU-based Acceleration of Deep Convolutional Neural Networks on Mobile Platforms 3,597 views
Fast 3D Graphics Rendering Technique with CUDA Parallel Processing 3,595 views
GenBase: A Complex Analytics Genomics Benchmark 3,594 views
Fast Convolutional Nets With fbfft: A GPU Performance Evaluation 3,593 views
Real-time Semi-Global Matching on the CPU 3,592 views
A Contour-Guided Deformable Image Registration Algorithm for Adaptive Radiotherapy 3,591 views
Water Surface Animation using Damped Wave Equation and CUDA Acceleration 3,588 views
On Benchmarking the Matrix Multiplication Algorithm using OpenMP, MPI and CUDA Programming Languages 3,587 views
Advanced illumination techniques for GPU volume raycasting 3,585 views
Performance Evaluation of Deep Learning Tools in Docker Containers 3,581 views
3-SAT on CUDA: Towards a massively parallel SAT solver 3,579 views
An Implementation of Differential Evolution for Independent Tasks Scheduling on GPU 3,578 views
Real-time and Realistic Simulation of Large-scale Deep Ocean Wave Foams Based on GPU 3,578 views
High speed cipher cracking: the case of Keeloq on CUDA 3,577 views
Massively Parallel Jacobian Computation 3,576 views
Sample distribution shadow maps 3,573 views
GPU implementation of JPEG XR 3,573 views
Noise Removal from Remote Sensed Images by NonLocal Means with OpenCL Algorithm 3,573 views
GPU Accelerated Keccak (SHA3) Algorithm 3,571 views
GPU-Quicksort: A practical Quicksort algorithm for graphics processors 3,571 views
The GPUVerify Method: a Tutorial Overview 3,571 views
Are Very Deep Neural Networks Feasible on Mobile Devices? 3,571 views
Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU 3,571 views
Bi-directional Path Tracing on GPU 3,569 views
Bitcoin and The Age of Bespoke Silicon 3,566 views
Comparison of Fragmentation/Dispersion Models for Asteroid Nuclear Disruption Mission Design 3,565 views
QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems 3,565 views
A Detailed GPU Cache Model Based on Reuse Distance Theory 3,563 views
Parallelization of the Algorithm WHAM with NVIDIA CUDA 3,562 views
Parallel GPU-accelerated Recursion-based Generators of Pseudorandom Numbers 3,562 views
Domain Specific Languages for High Performance Computing 3,558 views
Fast GPU-based image warping and inpainting for frame interpolation 3,558 views
GPU Pro 5: Advanced Rendering Techniques 3,557 views
Fast fluid dynamics simulation on the GPU 3,549 views
Large scale parallel state space search utilizing graphics processing units and solid state disks 3,544 views
An implementation of level set based topology optimization using GPU 3,543 views
OpenOF: Framework for Sparse Non-linear Least Squares Optimization on a GPU 3,541 views
Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures 3,539 views
Lattice Boltzmann based PDE solver on the GPU 3,539 views
Numerical computations in Java with CUDA 3,536 views
.NET High Performance Computing 3,535 views
Fractal Video Compression in OpenCL: An Evaluation of CPUs, GPUs, and FPGAs as Acceleration Platforms 3,534 views
Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing 3,534 views
High performance finite difference PDE solvers on GPUs 3,522 views
Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging 3,521 views
New efficient integral algorithms for quantum chemistry 3,517 views
Connected component labeling on a 2D grid using CUDA 3,513 views
Large Scale Plane Wave Pseudopotential Density Functional Theory Calculations on GPU Clusters 3,512 views
RTX Beyond Ray Tracing: Exploring the Use of Hardware Ray Tracing Cores for Tet-Mesh Point Location 3,510 views
Embedding OpenCL in GHC Haskell 3,507 views
2D/3D image registration on the GPU 3,505 views
SAGE: Self-Tuning Approximation for Graphics Engines 3,505 views
Fast Implementation of Scale Invariant Feature Transform Based on CUDA 3,504 views
Cudagrind: A Valgrind Extension for CUDA 3,503 views
Implementing a Photorealistic Rendering System using GLSL 3,502 views
GPU Implementation of the Branch and Bound method for knapsack problems 3,501 views
GPU Ray Tracing – Comparative Study of Ray-Triangle Intersection Algorithms 3,497 views
A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units 3,497 views
GPGPU Performance and Power Estimation Using Machine Learning 3,495 views
An open source finite-difference time-domain solver for room acoustics using graphics processing units 3,494 views
Code Optimization on GPUs 3,493 views
ShearLab 3D: Faithful Digital Shearlet Transforms based on Compactly Supported Shearlets 3,493 views
Connectivity-Based Segmentation for GPU-Accelerated Mesh Decompression 3,492 views
High-level GPU computing with jacket for MATLAB and C/C++ 3,487 views
Monte Carlo integration on GPU 3,482 views
ReSYCLator: Transforming CUDA C++ source code into SYCL 3,482 views
DynaProg for Scala: A Scala DSL for Dynamic Programming on CPU and GPU 3,480 views
DeepMetabolism: A Deep Learning System to Predict Phenotype from Genome Sequencing 3,480 views
PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks 3,479 views
Langevin dynamics simulations of biomolecules on graphics processors 3,476 views
Recurrent Neural Networks Hardware Implementation on FPGA 3,475 views
Incomplete-LU and Cholesky Preconditioned Iterative Methods Using CUSPARSE and CUBLAS 3,475 views
Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud 3,474 views
GGAS: Global GPU Address Spaces for Efficient Communication in Heterogeneous Clusters 3,473 views
Benchmarking State-of-the-Art Deep Learning Software Tools 3,473 views
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU 3,470 views
Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks 3,468 views
GPU Computing for Machine Learning Algorithms 3,467 views
Titles: 100
Total views: 355,058
- Programming - 186,231 views
- Login - 172,134 views
- User dashboard - 98,584 views
- Paper titles list - 92,743 views
- Add new event - 69,206 views
- Add new post - 62,798 views
- Register - 53,100 views
- Statistics - 44,246 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,520 views
- Books on OpenCL and CUDA - 31,164 views