Views of posts on hgpu.org
OpenCL framework for a CPU, GPU, and FPGA Platform 2,364 views
Parallel Implementation of Travelling Salesman Problem using Ant Colony Optimization 2,363 views
UNICORN: A Bulk Synchronous Programming Model, Framework and Runtime for Hybrid CPU-GPU Clusters 2,363 views
Deep Architectures for Neural Machine Translation 2,363 views
CUDA-based Signed Distance Field Calculation for Adaptive Grids 2,363 views
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation 2,363 views
Aristotle: A Performance Impact Indicator for the OpenCL Kernels Using Local Memory 2,363 views
Interactive Ray-tracing Based on OptiX to Visualize Signed Distance Fields 2,363 views
GPU-accelerated Database Systems: Survey and Open Challenges 2,361 views
Duality based optical flow algorithms with applications 2,361 views
A Domain Specific Language for Performance Portable Molecular Dynamics Algorithms 2,360 views
String Algorithm on GPGPU 2,360 views
Monte Carlo simulations on Graphics Processing Units 2,360 views
Enhanced Parallel NegaMax Tree Search Algorithm on GPU 2,360 views
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs 2,360 views
OpenCL based machine learning labeling of biomedical datasets 2,360 views
A Performance Comparison of Different Graphics Processing Units Running Direct N-Body Simulations 2,360 views
LookNN: Neural Network with No Multiplication 2,358 views
Bamboo: Automatic Translation of MPI Source into a Latency-Tolerant Form 2,358 views
Energy Efficiency Studies of Mont Blanc Applications 2,357 views
Singular value decomposition on GPU using CUDA 2,357 views
A GPU based real-time video compression method for video conferencing 2,357 views
Practical Symbolic Race Checking of GPU Programs 2,357 views
Processing Big Data in Main Memory and on GPU 2,356 views
One OpenCL to Rule Them All? 2,356 views
Gdev: First-Class GPU Resource Management in the Operating System 2,355 views
A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches 2,354 views
Spatial Join with R-Tree on Graphics Processing Units 2,354 views
Future of GPGPU Micro-Architectural Parameters 2,354 views
March of the Froblins: simulation and rendering massive crowds of intelligent and detailed creatures on GPU 2,353 views
DenseCut: Densely Connected CRFs for Realtime GrabCut 2,353 views
Scaling LAPACK panel operations using parallel cache assignment 2,352 views
A Spiking Neural P system simulator based on CUDA 2,352 views
fairseq: A Fast, Extensible Toolkit for Sequence Modeling 2,352 views
Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning 2,351 views
A GPU Tool for Efficient, Accurate, and Realistic Simulation of Cone Beam CT Projections 2,351 views
The Q Continuum Simulation: Harnessing the Power of GPU Accelerated Supercomputers 2,351 views
NVIDIA Tensor Core Programmability, Performance & Precision 2,351 views
Review: Kd-tree Traversal Algorithms for Ray Tracing 2,350 views
AccFFT: A library for distributed-memory FFT on CPU and GPU architectures 2,350 views
Task scheduling in hybrid CPU-GPU systems 2,349 views
Binaural Simulations Using Audio Rate FDTD Schemes and CUDA 2,349 views
NUPAR: A Benchmark Suite for Modern GPU Architectures 2,349 views
A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA 2,349 views
Accelerating GPU Implementation of Contourlet Transform 2,348 views
Feature Extraction and Visualization from Higher-Order CFD Data 2,348 views
Parallel Support Vector Machines in Practice 2,348 views
DistCL: A Framework for the Distributed Execution of OpenCL Kernels 2,348 views
SqueezCL: Squeezing OpenCL Kernels for Approximate Computing on Contemporary GPUs 2,347 views
Acceleration of Monte-Carlo Molecular Simulations on Hybrid Computing Architectures 2,347 views
A numerical tour of wave propagation 2,347 views
Massively parallel Monte Carlo for many-particle simulations on GPUs 2,347 views
Analyzing CUDA workloads using a detailed GPU simulator 2,347 views
Parallel one-versus-rest SVM training on the GPU 2,347 views
Application of Deep-Learning to Compiler-Based Graphs 2,347 views
Fine-Grained Synchronizations and Dataflow Programming on GPUs 2,347 views
GPU Implementation of Real-Time Biologically Inspired Face Detection using CUDA 2,346 views
Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs 2,345 views
GpuCV: A GPU-Accelerated Framework for Image Processing and Computer Vision 2,344 views
An efficient KNN algorithm implemented on FPGA based heterogeneous computing system using OpenCL 2,343 views
Parallel Banding Algorithm to compute exact distance transform with the GPU 2,343 views
Implementation of Motion Estimation Based on Heterogeneous Parallel Computing System with OpenCL 2,343 views
3D-color video camera 2,342 views
A GPU accelerated spring mass system for surgical simulation 2,342 views
Face Detection with Improved Local Binary Patterns in CUDA 2,342 views
The FFT on a GPU 2,342 views
Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication 2,341 views
Grover: Looking for Performance Improvement by Disabling Local Memory Usage in OpenCL Kernels 2,341 views
Fast and scalable list ranking on the GPU 2,340 views
A collision detection algorithm using adaptive particle sensor 2,340 views
A Parallel Algorithm for Calculation of Large Determinants with High Accuracy for GPUs and MPI clusters 2,340 views
Pseudo Random Number Generators on Graphics Processing Units, with Applications in Finance 2,339 views
Discrete Shearlet Transform on GPU with Applications in Anomaly Detection and Denoising 2,339 views
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning 2,339 views
Graphics Programming on the Web WebCL Course Notes 2,339 views
Exascale Deep Learning for Climate Analytics 2,339 views
Computation on GPU of Eigenvalues and Eigenvectors of a Large Number of Small Hermitian Matrices 2,338 views
Design of a Hybrid Memory System for General-Purpose Graphics Processing Units 2,337 views
Array Program Transformation with Loo.py by Example: High-Order Finite Elements 2,337 views
gSLICr: SLIC superpixels at over 250Hz 2,337 views
The Bones Source-to-Source Compiler Manual 2,336 views
Parallelized Seeded Region Growing using CUDA 2,336 views
An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU implementation 2,336 views
Hybrid Parallel Streamline Extraction Combining MPI and OpenCL 2,336 views
Facial Recognition Using Neural Networks over GPGPU 2,336 views
A comparative study of GPU programming models and architectures using neural networks 2,335 views
Fast hydrodynamics on heterogenous many-core hardware 2,335 views
PErasure: a Parallel Cauchy Reed-Solomon Coding Library for GPUs 2,335 views
Accelerating Exact Similarity Search on CPU-GPU Systems 2,334 views
An Execution Model for OpenCL 2.0 2,334 views
Running Financial Risk Management Applications on FPGA in the Amazon Cloud 2,334 views
Togpu: Automatic Source Transformation from C++ to CUDA using Clang/LLVM 2,333 views
Real-Time Surface Extraction and Visualization of Medical Images using OpenCL and GPUs 2,333 views
Titles: 100
Total views: 234,844
- Programming - 186,133 views
- Login - 164,571 views
- User dashboard - 91,322 views
- Paper titles list - 71,400 views
- Add new event - 64,819 views
- Add new post - 59,626 views
- Register - 49,323 views
- Statistics - 37,182 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,194 views
- Books on OpenCL and CUDA - 28,901 views