Views of posts on hgpu.org
Modification of self-organizing migration algorithm for OpenCL framework 33,678 views
Parallel Ray Tracing Simulations with MATLAB for Dynamic Lens Systems 14,745 views
Data Layout Pruning on GPU 11,844 views
FPGA implementation of a Convolutional Neural Network for "Wake up word" detection 9,272 views
Performance Evaluation of Container-based Virtualization for High Performance Computing Environments 9,236 views
Computing Treewidth on the GPU 8,934 views
OpenCL Programming by Example 8,849 views
Energy efficiency of finite difference algorithms on multicore CPUs, GPUs, and Intel Xeon Phi processors 8,840 views
GMM based Fisher vector calculation on GPGPU 8,828 views
OpenCL Actors – Adding Data Parallelism to Actor-based Programming with CAF 8,792 views
An Efficient Load Balancing Method for Tree Algorithms 8,769 views
Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices 8,718 views
GALARIO: a GPU Accelerated Library for Analysing Radio Interferometer Observations 8,562 views
Matrix inversion speed up with CUDA 8,060 views
Torch7: A Matlab-like Environment for Machine Learning 7,815 views
OpenMP Programming on Intel® Xeon Phi™ Coprocessors: An Early Performance Comparison 7,785 views
Code Optimization Techniques for Graphics Processing Units 7,483 views
PySPH: A Python framework for SPH 7,154 views
End-to-end Deep Learning of Optimization Heuristics 7,045 views
Breaking DVB-CSA 7,029 views
Accelerating Radio Astronomy with Auto-Tuning 7,004 views
Monte Carlo methods for massively parallel computers 6,918 views
GPU implementation of a deep learning network for image recognition tasks 6,860 views
IBM Deep Learning Service 6,786 views
Automated Testing of Graphics Shader Compilers 6,726 views
Asynchronous Task-Based Polar Decomposition on Single Node Manycore Architectures 6,706 views
NaNet: a low-latency NIC enabling GPU-based, real-time low level trigger systems 6,680 views
Out-of-core Implementation for Accelerator Kernels on Heterogeneous Clouds 6,649 views
Meta Networks for Neural Style Transfer 6,575 views
Empower Sequence Labeling with Task-Aware Neural Language Model 6,567 views
CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs 6,465 views
Random Forests of Very Fast Decision Trees on GPU for Mining Evolving Big Data Streams 5,490 views
A Comparative Study of 2D Numerical Methods with GPU Computing 5,453 views
Implementing Level-3 BLAS Routines in OpenCL on Different Processing Units 5,252 views
Advanced 2D Rasterization on Modern CPUs 5,225 views
Sorting with GPUs: A Survey 5,215 views
BIDMach: Large-scale Learning with Zero Memory Allocation 5,188 views
An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture 5,167 views
SoAx: A generic C++ Structure of Arrays for handling Particles in HPC Codes 5,142 views
gSLIC: a real-time implementation of SLIC superpixel segmentation 5,138 views
Deep learning for galaxy surface brightness profile fitting 5,134 views
Report: Performance comparison between C2075 and P100 GPU cards using cosmological correlation functions 5,092 views
Fast Parallel Sorting Algorithms on GPUs 5,054 views
Optimization of the Brillouin operator on the KNL architecture 5,053 views
libWater: Heterogeneous Distributed Computing Made Easy 5,049 views
Hydra: a C++11 framework for data analysis in massively parallel platforms 5,017 views
Accelerating Genomics Research with OpenCL and FPGAs 4,967 views
Best Practice Guide – GPGPU 4,910 views
The CUDA Handbook: A Comprehensive Guide to GPU Programming 4,903 views
Implementing Neural Networks Efficiently 4,879 views
Vectorized algorithm for multidimensional Monte Carlo integration on modern GPU, CPU and MIC architectures 4,799 views
Launch-time Optimization of OpenCL Kernels 4,726 views
Synkhronos: a Multi-GPU Theano Extension for Data Parallelism 4,689 views
Radeon PRO Solid State Graphics (SSG) API User Manual 4,671 views
Acceleration of tensor-product operations for high-order finite element methods 4,663 views
Comparison of Parallelisation Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs 4,658 views
Build and Travel KD-Tree with CUDA 4,626 views
Usage of GPU in LS-DYNA 4,594 views
Scandalously Parallelizable Mesh Generation 4,516 views
Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing 4,501 views
vCUDA Framework Development for GPU Virtualization 4,468 views
Flexible FPGA design for FDTD using OpenCL 4,330 views
Distributed Training Large-Scale Deep Architectures 4,324 views
Efficient 2D Software Rendering 4,306 views
A Dynamic Hash Table for the GPU 4,292 views
OpenCL Programming Guide 4,285 views
DTAM: Dense tracking and mapping in real-time 4,273 views
Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model 4,230 views
Parallel Neural Network Training with OpenCL 4,220 views
GPU Pro 6: Advanced Rendering Techniques 4,166 views
Unified Deep Learning with CPU, GPU, and FPGA Technologies 4,162 views
Theano: Deep Learning on GPUs with Python 4,112 views
CUSIMANN: An optimized simulated annealing software for GPUs 4,097 views
Nemo: A parallelized Lagrangian particle-tracking model 4,097 views
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads 4,065 views
Tesla vs. Xeon Phi vs. Radeon: A Compiler Writer’s Perspective 4,062 views
Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels 4,046 views
Nengo: a Python tool for building large-scale functional brain models 4,040 views
On Pre-Trained Image Features and Synthetic Images for Deep Learning 4,008 views
Deep Voice 3: 2000-Speaker Neural Text-to-Speech 3,998 views
ChainerMN: Scalable Distributed Deep Learning Framework 3,986 views
A Framework for Productive, Efficient and Portable Parallel Computing 3,985 views
GooFit 2.0 3,973 views
GPU Passthrough Performance: A Comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL Applications 3,970 views
PCIeHLS: an OpenCL HLS framework 3,961 views
Efficient Algorithms for Sorting on GPUs 3,956 views
cudaMap: a GPU accelerated program for gene expression connectivity mapping 3,937 views
A Study of Time and Energy Efficient Algorithms for Parallel and Heterogeneous Computing 3,935 views
Deep and Shallow convections in Atmosphere Models on Intel Xeon Phi Coprocessor Systems 3,932 views
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems 3,892 views
Data Coherence Analysis and Optimization for Heterogeneous Computing 3,863 views
Titles: 100
Total views: 593,139
- Programming - 185,522 views
- Login - 147,546 views
- User dashboard - 66,325 views
- Add new post - 47,241 views
- Register - 43,960 views
- Add new event - 40,728 views
- Paper titles list - 35,132 views
- Modification of self-organizing migration algorithm for OpenCL framework - 33,678 views
- Hardware - 26,160 views
- Books on OpenCL and CUDA - 23,388 views