Views of posts on hgpu.org
Implementing density functional theory (DFT) methods on many-core GPGPU accelerators 3,035 views
Accelerating the ANSYS Direct Sparse Solver with GPUs 3,035 views
libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications 3,035 views
GPU Accelerated Greedy Algorithms for Compressed Sensing 3,035 views
Hybrid CPU-GPU Implementation of Tracking-Learning-Detection Algorithm 3,033 views
A Parallel Algorithm of PCA-SIFT Based on CUDA 3,032 views
Programming Frameworks for Distributed Smartphone Computing 3,030 views
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL? 3,030 views
Support for Parallel Scan in OpenMP 3,030 views
Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels 3,030 views
Legolizer: A Real-Time System for Modeling and Rendering LEGO Representations of Boundary Models 3,027 views
KERNELGEN – A Toolchain for Automatic GPU-centric Applications Porting 3,026 views
Fractal Based Method on Hardware Acceleration for Natural Environments 3,024 views
A Case for Work-stealing on FPGAs with OpenCL Atomics 3,023 views
SWM: Simplified Wu-Manber for GPU-based Deep Packet Inspection 3,022 views
GPU Accelerated Face Detection (thesis) 3,021 views
3D tumor localization through real-time volumetric x-ray imaging for lung cancer radiotherapy 3,020 views
Towards Interactive Visual Exploration of Parallel Programs using a Domain-specific Language 3,020 views
GPU Virtualization 3,020 views
Introducing CURRENNT: The Munich Open-Source CUDA RecurREnt Neural Network Toolkit 3,020 views
Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU 3,019 views
The Accelerator Wall: Limits of Chip Specialization 3,019 views
A Hybrid Approach to Parallel Connected Component Labeling Using CUDA 3,019 views
Hybrid CPU-GPU Pipeline Framework 3,019 views
A Data-Parallel Graphics Pipeline Implemented in OpenCL 3,018 views
On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach 3,017 views
GPU-Based Airway Tree Segmentation and Centerline Extraction 3,015 views
High Performance Histograms on SIMT and SIMD Architectures 3,015 views
Understanding the efficiency of GPU algorithms for matrix-matrix multiplication 3,015 views
Hardware Transactional Memory for GPU Architectures 3,015 views
Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths 3,014 views
Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems 3,014 views
A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers 3,014 views
The GPU Computing Era 3,013 views
Molecular dynamics simulations through GPU video games technologies 3,013 views
A design case study: CPU vs. GPGPU vs. FPGA 3,013 views
Face Recognition Using OpenCL 3,012 views
Sailfish: a flexible multi-GPU implementation of the lattice Boltzmann method 3,012 views
Computing Strongly Connected Components with CUDA 3,011 views
Rubus: A compiler for seamless and extensible parallelism 3,011 views
Multi-Scale, Multi-Level, Heterogeneous Features Extraction and Classification of Volumetric Medical Images 3,009 views
Theano-based Large-Scale Visual Recognition with Multiple GPUs 3,009 views
FastTree: A Hardware KD-Tree Construction Acceleration Engine for Real-Time Ray Tracing 3,009 views
Design and Development of an Efficient H.264 Video Encoder for CPU/GPU using OpenCL 3,008 views
A short guide to CUDA C: For physicists with multi-core graphics cards 3,007 views
OpenGL SuperBible: Comprehensive Tutorial and Reference (5th Edition) 3,007 views
Demystifying GPU microarchitecture through microbenchmarking 3,006 views
GPU-based ultrafast IMRT plan optimization 3,005 views
Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer 3,005 views
Matrix Multiplication with CUDA – A basic introduction to the CUDA programming model 3,005 views
Anisotropic Kuwahara Filtering on the GPU 3,005 views
Parallelization of the Generalized Hough Transform on GPU 3,004 views
Adapting the GA Approach to Solve Traveling Salesman Problems on CUDA Architecture 3,003 views
Fast Burrows Wheeler Compression Using CPU and GPU 3,003 views
A Comparison of Modern GPU and CPU Architectures: And the Common Convergence of Both 3,002 views
3D GPU Architecture using Cache Stacking: Performance, Cost, Power and Thermal analysis 3,002 views
Transparent CPU-GPU Collaboration for Data-Parallel Kernels on Heterogeneous Systems 3,001 views
Machine Learning from Streaming Data in Heterogeneous Computing Environments 3,000 views
Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning 3,000 views
Improved Finite Difference Schemes for a 3-D Viscothermal Wave Equation on a GPU 2,999 views
GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model 2,998 views
GHOST: GPGPU-Offloaded High Performance Storage I/O Deduplication for Primary Storage System 2,996 views
Optimizing Performance of Recurrent Neural Networks on GPUs 2,996 views
cf4ocl: a C framework for OpenCL 2,996 views
Increasing GPU Throughput using Kernel Interleaved Thread Block Scheduling 2,995 views
Finding Longest Common Subsequences by GPU-Based Parallel Ant Colony Optimization 2,994 views
GPU Array Access Auto-Tuning 2,992 views
A code motion technique for accelerating general-purpose computation on the GPU 2,992 views
Real-time Image Processing on Low Cost Embedded Computers 2,991 views
The Comparisons of OpenCL and OpenMP Computing Paradigm 2,987 views
On Vectorization of Deep Convolutional Neural Networks for Vision Tasks 2,987 views
Using GPUs for Machine Learning Algorithms 2,985 views
Implementation of the SYCL Heterogeneous Computing Library 2,984 views
Molecular dynamics recipes for genome research 2,983 views
High productivity multi-device exploitation with the Heterogeneous Programming Library 2,983 views
Scalable Kernel Fusion for Memory-Bound GPU Applications 2,983 views
Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning 2,983 views
Data-Parallel Octrees for Surface Reconstruction 2,982 views
Implementation of digital down converter in GPU 2,980 views
Multi-Threaded Automatic Integration Using OpenMP and CUDA 2,980 views
XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures 2,980 views
Pseudorandom Numbers Generation for Monte Carlo Simulations on GPUs: OpenCL Approach 2,979 views
State Lattice-based Motion Planning for Autonomous On-Road Driving 2,979 views
FastMag: Fast micromagnetic simulator for complex magnetic structures 2,979 views
Dynamic Memory Allocation for OpenCL 2,978 views
Achieving TeraCUPS on Longest Common Subsequence Problem using GPGPUs 2,977 views
Hybrid GPU-Based Single- and Double-Bounce SAR Simulation 2,977 views
Sparser, Better, Faster GPU Parsing 2,976 views
Converting Data to Task-Parallelism by Rewrites 2,976 views
Parallel Voronoi Diagram computation on scaled distance planes using CUDA 2,975 views
GeNN: a code generation framework for accelerated brain simulations 2,974 views
clMAGMA: High Performance Dense Linear Algebra with OpenCL 2,974 views
Best Practice Guide Intel Xeon Phi v2.0 2,974 views
OpenNMT: Open-Source Toolkit for Neural Machine Translation 2,972 views
Titles: 100
Total views: 300,465
- Programming - 186,232 views
- Login - 172,160 views
- User dashboard - 98,601 views
- Paper titles list - 92,957 views
- Add new event - 69,211 views
- Add new post - 62,810 views
- Register - 53,112 views
- Statistics - 44,259 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,523 views
- Books on OpenCL and CUDA - 31,175 views