Views of posts on hgpu.org
From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation 3,026 views
Embedding OpenCL in GHC Haskell 3,023 views
How to Benefit from AMD, Intel and Nvidia Accelerator Technologies in Scilab 3,020 views
A parallel algorithm for implicit depletant simulations 3,018 views
OpenCL Fast Fourier Transform 3,018 views
Massively Parallel A* Search on a GPU 3,018 views
Computing resultants on Graphics Processing Units: Towards GPU-accelerated computer algebra 3,015 views
A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs 3,013 views
A Survey of CPU-GPU Heterogeneous Computing Techniques 3,012 views
GPGPU Performance and Power Estimation Using Machine Learning 3,012 views
Training Neural Networks Without Gradients: A Scalable ADMM Approach 3,011 views
A GPU-based Approximate SVD Algorithm 3,010 views
An open source finite-difference time-domain solver for room acoustics using graphics processing units 3,009 views
Implementing a Photorealistic Rendering System using GLSL 3,008 views
FIR filtering and AES encryption with OpenCL 2.0 3,007 views
Weighted Residuals for Very Deep Networks 3,005 views
Are Very Deep Neural Networks Feasible on Mobile Devices? 3,005 views
Monte Carlo integration on GPU 3,004 views
Distributed OpenCL: a platform for distributed, heterogeneous computing for domain scientists 3,004 views
A Comparative Study of Game Tree Searching Methods 3,002 views
Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices 3,000 views
TABLA: A Unified Template-based Framework for Accelerating Statistical Machine Learning 3,000 views
An implementation of level set based topology optimization using GPU 2,996 views
Batched Shift Reduce Parsing with Lists of Vectors on CUDA 2,996 views
DeepMetabolism: A Deep Learning System to Predict Phenotype from Genome Sequencing 2,993 views
Sparse Matrix Matrix Multiplication on Hybrid CPU+GPU Platforms 2,991 views
Brook for GPUs: Stream Computing on Graphics Hardware 2,990 views
Accelerating video decoding using GPU 2,989 views
Performance Portability Evaluation for OpenACC on Intel Knights Corner and Nvidia Kepler 2,987 views
Ocean wave simulation in real-time using GPU 2,987 views
Two Approaches to Particle Simulation: OpenMPI and CUDA 2,981 views
GPU-Acceleration of Linear Algebra using OpenCL 2,980 views
Face Recognition: A Tutorial on Computational Aspects 2,979 views
Large Integer Arithmetic in GPU for Cryptography 2,978 views
A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures 2,977 views
Accelerating MapReduce on a coupled CPU-GPU architecture 2,977 views
GPU-SD and DPD Parallelization for Gromacs tools for molecular dynamics simulations 2,975 views
A compiler toolkit for array-based languages targeting CPU/GPU hybrid systems 2,974 views
Sparselet Models for Efficient Multiclass Object Detection 2,973 views
OpenMP for Accelerators 2,970 views
RTX Beyond Ray Tracing: Exploring the Use of Hardware Ray Tracing Cores for Tet-Mesh Point Location 2,970 views
A volume segmentation approach based on GrabCut 2,965 views
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU 2,965 views
Parallel face Detection and Recognition on GPU 2,963 views
Template Library for Multi-GPU Pseudorandom Number Recursion-based Generators 2,961 views
Rootbeer: Seamlessly using GPUs from Java 2,959 views
Fast On-line Statistical Learning on a GPGPU 2,958 views
HCudaBLAST: an implementation of BLAST on Hadoop and Cuda 2,956 views
Efficient Implementation of Bi-directional Path Tracer on GPU 2,954 views
Langevin dynamics simulations of biomolecules on graphics processors 2,953 views
SparkCL: A Unified Programming Framework for Accelerators on Heterogeneous Clusters 2,952 views
Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud 2,950 views
Efficient GPGPU-based parallel packet classification 2,950 views
C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++ 2,949 views
Using GPU-based Computing To Accelerate Finite Element Problems 2,947 views
Writing a performance-portable matrix multiplication 2,946 views
Large scale parallel state space search utilizing graphics processing units and solid state disks 2,945 views
gem5-gpu: A Heterogeneous CPU-GPU Simulator 2,942 views
Connected component labeling on a 2D grid using CUDA 2,936 views
GPUDet: A Deterministic GPU Architecture 2,935 views
Implementation of a Lattice–Boltzmann method for numerical fluid mechanics using the nVIDIA CUDA technology 2,925 views
A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data 2,924 views
Fast 2-D Ultrasound Strain Imaging: The Benefits of Using a GPU 2,923 views
A (ir)regularity-aware task scheduler for heterogeneous platforms 2,923 views
eccCL: parallelized GPU implementation of Ensemble Classifier Chains 2,923 views
clRNG: A Random Number API with Multiple Streams for OpenCL 2,923 views
ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale 2,923 views
Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing 2,922 views
GPU Computing Gems: Emerald Edition 2,921 views
Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor 2,917 views
New Sparse Matrix Storage Format to Improve The Performance of Total SPMV Time 2,916 views
Forecasting high frequency financial time series using parallel FFN with CUDA and ZeroMQ 2,916 views
Fast Hair Simulation and Rendering Using CUDA and OpenGL 2,916 views
Real-Time Incompressible Fluid Simulation on the GPU 2,915 views
GPU-based 3D Wavelet Transform 2,914 views
Accelerating Random Forests on CPUs and GPUs for Object-Class Image Segmentation 2,913 views
Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs 2,912 views
FFT and Convolution Performance in Image Filtering on GPU 2,911 views
Monte-Carlo Black-Scholes Implementation using OpenCL Standard 2,906 views
Up to 700k GPU cores, Kepler, and the Exascale future for simulations of star clusters around black holes 2,904 views
GPUfs: Integrating a File System with GPUs 2,903 views
Multithreading for Visual Effects 2,903 views
Exposure Render: An Interactive Photo-Realistic Volume Rendering Framework 2,903 views
Canny edge detection on NVIDIA CUDA 2,903 views
CUDA-Based Jacobi’s Iterative Method 2,902 views
A Performance Model for Memory Bandwidth Constrained Applications on Graphics Engines 2,898 views
CUDA cuts: Fast graph cuts on the GPU 2,898 views
Multi-level Parallelism for Incompressible Flow Computations on GPU Clusters 2,894 views
Optimizing ASP.NET with C++ AMP on the GPU 2,894 views
vSMC: Parallel Sequential Monte Carlo in C++ 2,890 views
A GPU-based framework for efficient image processing 2,890 views
Massive Parallel Implementation of ODE Solvers 2,890 views
Automatic Test Case Reduction for OpenCL 2,889 views
GPU Scripting and Code Generation with PyCUDA 2,884 views
Genetically Improved CUDA kernels for StereoCamera 2,880 views
Rodinia: A benchmark suite for heterogeneous computing 2,879 views
TensorFlow: A system for large-scale machine learning 2,879 views
Titles: 100
Total views: 295,258
- Programming - 186,129 views
- Login - 164,346 views
- User dashboard - 90,579 views
- Paper titles list - 69,997 views
- Add new event - 64,575 views
- Add new post - 59,313 views
- Register - 49,174 views
- Statistics - 36,455 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,165 views
- Books on OpenCL and CUDA - 28,806 views