Views of posts on hgpu.org
BENCHIP: Benchmarking Intelligence Processors 4,089 views
Efficient Sparse Matrix-Vector Multiplication on x86-Based Many-Core Processors 4,083 views
ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU 4,078 views
OpenCL vs. OpenMP: A Programmability Debate 4,077 views
Adaptation of an acoustic propagation model to the parallel architecture of a graphics processor 4,066 views
GPU-Powered Coherent Beamforming 4,066 views
Dissecting GPU Memory Hierarchy through Microbenchmarking 4,065 views
A Comparison of Potential Interfaces for Batched BLAS Computations 4,059 views
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs 4,044 views
Optimizing Stencil Computations for NVIDIA Kepler GPUs 4,040 views
Parallel Implementation of Moving Averages and Stock Market Prediction 4,014 views
Lattice QCD on new chips: a community summary 4,009 views
BigKernel — High Performance CPU-GPU Communication Pipelining for Big Data-style Applications 4,002 views
Anisotropic mesh coarsening and refinement on GPU architecture 3,996 views
Enabling High Performance Computing in Cloud Infrastructure using Virtualized GPUs 3,995 views
A Development Platform for Embedded Domain-Specific Languages 3,988 views
A tool for mapping Single Nucleotide Polymorphisms using Graphics Processing Units 3,979 views
clpeak – peak performance of your opencl device 3,970 views
Architecting SOT-RAM Based GPU Register File 3,967 views
Bitmap Filter: Speeding up Exact Set Similarity Joins with Bitwise Operations 3,951 views
Using OpenCL: Programming Massively Parallel Computers 3,948 views
Adaptation of algorithms for underwater sonar data processing to GPU-based systems 3,946 views
Semi-Global Matching – Motivation, Developments and Applications 3,946 views
Hadoop+Aparapi: Making heterogenous MapReduce programming easier 3,937 views
Data Transfer Matters for GPU Computing 3,936 views
GPU Random Numbers via the Tiny Encryption Algorithm 3,920 views
Deterministic Sample Sort For GPUs 3,901 views
A Semi-Automated Tool Flow for Roofline Analysis of OpenCL Kernels on Accelerators 3,899 views
Parallel Irradiance Caching on the GPU 3,898 views
GPU Parallelization for Unstructured Sparse Matrix Problems with OpenMP 4.5 and OpenACC 3,895 views
GPU acceleration and performance of the particle-beam-dynamics code Elegant 3,894 views
OpenCL Parallel Programming Development Cookbook 3,877 views
Performance Evaluation of R with Intel Xeon Phi Coprocessor 3,875 views
Introducing CURRENNT – the Munich open-source CUDA RecurREnt Neural Network Toolkit 3,873 views
An Exploratory Study of High Performance Graphics Application Programming Interfaces 3,873 views
A survey on graphic processing unit computing for large-scale data mining 3,869 views
Uses of GPU Powered Interval Optimization for Parameter Identification in the Context of SO Fuel Cells 3,869 views
Efficient Hash Tables on the GPU 3,865 views
A portable implementation of the radix sort algorithm in OpenCL 3,856 views
maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs 3,849 views
GPU Accelerated Vessel Segmentation Using Laplacian Eigenmaps 3,831 views
Efficient Parallel RSA Decryption Algorithm for Many-core GPUs with CUDA 3,830 views
Atmospheric Chemistry 3,829 views
CUD@SAT: SAT Solving on GPUs 3,805 views
Efficient Inference For Neural Machine Translation 3,800 views
You Can Type, but You Can’t Hide: A Stealthy GPU-based Keylogger 3,794 views
Learning Random Forests on the GPU 3,791 views
A 3D Convex Hull Algorithm for Graphics Hardware 3,785 views
Solving Linear Equations with Conjugate Gradient Method on OpenCL Platforms 3,779 views
Efficient Cubic B-spline Image Interpolation on a GPU 3,777 views
Deep API Learning 3,764 views
GPU Programming in Rust: Implementing High Level Abstractions in a Systems Level Language 3,747 views
Multi-Platform LU-Decomposition Solution in OpenCL 3,739 views
CUDA-OpenGL Interoperability to Visualize Electromagnetic Fields Calculated by FDTD 3,729 views
Accelerating Simulation Codes through the GeMTC Framework 3,727 views
Bilateral Filtering with CUDA 3,724 views
The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development 3,713 views
CL2QCD – Lattice QCD based on OpenCL 3,711 views
Real-Time Hair Simulation and Rendering with OpenCL and OpenGL 3,707 views
GPGPU-Aided 3D Staggered-grid Finite-difference Seismic Wave Modeling 3,698 views
Progressive Photon Mapping on GPUs 3,687 views
CudaRF: A CUDA-based Implementation of Random Forests 3,673 views
Hierarchical belief propagation to reduce search space using CUDA for stereo and motion estimation 3,654 views
Offload Compiler Runtime for the Intel Xeon Phi Coprocessor 3,653 views
Professional CUDA C Programming 3,652 views
High Performance Extreme Learning Machines: A Complete Toolbox for Big Data Applications 3,647 views
Bigger Buffer k-d Trees on Multi-Many-Core Systems 3,644 views
GPU-accelerated computation for robust motion tracking using the CUDA framework 3,639 views
Non-separable 2D, 3D and 4D filtering with CUDA 3,635 views
Hybrid strategy for stencil computations on the APU 3,610 views
Designing Scientific Applications on GPUs 3,610 views
The Virtual OpenCL (VCL) Cluster Platform 3,601 views
Implementation of Just In Time Value Specialization for the Optimization of Data Parallel Kernels 3,583 views
GPU Implementation of a Deep Learning Network for Financial Prediction 3,582 views
State of the Art Report on Real-time Rendering with Hardware Tessellation 3,580 views
Hidden Surface Removal Using BSP Tree with CUDA 3,579 views
A GPU Accelerated Algorithm for Compressive Sensing Based Image Super-Resolution 3,576 views
Deep Learning on FPGAs: Past, Present, and Future 3,576 views
Efficient Hybrid Execution of C++ Applications using Intel(R) Xeon Phi(TM) Coprocessor 3,572 views
Sparse Matrix-Vector Multiplication on GPU 3,570 views
Dogwild! – Distributed Hogwild for CPU & GPU 3,558 views
Parallel and Concurrent Programming in Haskell: Techniques for Multicore and Multithreaded Programming 3,556 views
An OpenCL Runtime and Scheduler for Embedded Multicore DSP Parallel Systems 3,556 views
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks 3,552 views
GPU Parallel Collections For Scala 3,551 views
Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi 3,547 views
GPU-ABiSort: Optimal Parallel Sorting on Stream Architectures 3,545 views
GPGPU Acceleration for Skeletal Animation – comparing OpenCL with CUDA and GLSL 3,543 views
High-Performance GPGPU Programming with OCaml 3,529 views
Titles: 100
Total views: 379,116
- Programming - 186,126 views
- Login - 164,120 views
- User dashboard - 90,236 views
- Paper titles list - 69,508 views
- Add new event - 64,509 views
- Add new post - 59,082 views
- Register - 49,120 views
- Statistics - 36,124 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,158 views
- Books on OpenCL and CUDA - 28,744 views