Views of posts on hgpu.org

OpenCL framework for a CPU, GPU, and FPGA Platform  2,364 views

Parallel Implementation of Travelling Salesman Problem using Ant Colony Optimization  2,363 views

UNICORN: A Bulk Synchronous Programming Model, Framework and Runtime for Hybrid CPU-GPU Clusters  2,363 views

Deep Architectures for Neural Machine Translation  2,363 views

CUDA-based Signed Distance Field Calculation for Adaptive Grids  2,363 views

PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation  2,363 views

Aristotle: A Performance Impact Indicator for the OpenCL Kernels Using Local Memory  2,363 views

Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems  2,363 views

Interactive Ray-tracing Based on OptiX to Visualize Signed Distance Fields  2,363 views

3D simulation of complex shading affecting PV systems taking benefit from the power of graphics cards developed for the video game industry  2,361 views

Accounting for Secondary Uncertainty: Efficient Computation of Portfolio Risk Measures on Multi and Many Core Architectures  2,361 views

GPU-accelerated Database Systems: Survey and Open Challenges  2,361 views

Duality based optical flow algorithms with applications  2,361 views

A Domain Specific Language for Performance Portable Molecular Dynamics Algorithms  2,360 views

String Algorithm on GPGPU  2,360 views

Monte Carlo simulations on Graphics Processing Units  2,360 views

Enhanced Parallel NegaMax Tree Search Algorithm on GPU  2,360 views

Scheduling a Parallel Sparse Direct Solver to Multiple GPUs  2,360 views

OpenCL based machine learning labeling of biomedical datasets  2,360 views

A Performance Comparison of Different Graphics Processing Units Running Direct N-Body Simulations  2,360 views

LookNN: Neural Network with No Multiplication  2,358 views

Bamboo: Automatic Translation of MPI Source into a Latency-Tolerant Form  2,358 views

Energy Efficiency Studies of Mont Blanc Applications  2,357 views

Singular value decomposition on GPU using CUDA  2,357 views

A GPU based real-time video compression method for video conferencing  2,357 views

Practical Symbolic Race Checking of GPU Programs  2,357 views

Processing Big Data in Main Memory and on GPU  2,356 views

One OpenCL to Rule Them All?  2,356 views

Gdev: First-Class GPU Resource Management in the Operating System  2,355 views

A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches  2,354 views

Spatial Join with R-Tree on Graphics Processing Units  2,354 views

Future of GPGPU Micro-Architectural Parameters  2,354 views

March of the Froblins: simulation and rendering massive crowds of intelligent and detailed creatures on GPU  2,353 views

DenseCut: Densely Connected CRFs for Realtime GrabCut  2,353 views

Scaling LAPACK panel operations using parallel cache assignment  2,352 views

A Spiking Neural P system simulator based on CUDA  2,352 views

fairseq: A Fast, Extensible Toolkit for Sequence Modeling  2,352 views

Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning  2,351 views

A GPU Tool for Efficient, Accurate, and Realistic Simulation of Cone Beam CT Projections  2,351 views

The Q Continuum Simulation: Harnessing the Power of GPU Accelerated Supercomputers  2,351 views

NVIDIA Tensor Core Programmability, Performance & Precision  2,351 views

Review: Kd-tree Traversal Algorithms for Ray Tracing  2,350 views

AccFFT: A library for distributed-memory FFT on CPU and GPU architectures  2,350 views

libcloudph++ 0.1: single-moment bulk, double-moment bulk, and particle-based warm-rain microphysics library in C++  2,350 views

Task scheduling in hybrid CPU-GPU systems  2,349 views

Binaural Simulations Using Audio Rate FDTD Schemes and CUDA  2,349 views

NUPAR: A Benchmark Suite for Modern GPU Architectures  2,349 views

A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA  2,349 views

Accelerating GPU Implementation of Contourlet Transform  2,348 views

Feature Extraction and Visualization from Higher-Order CFD Data  2,348 views

Parallel Support Vector Machines in Practice  2,348 views

DistCL: A Framework for the Distributed Execution of OpenCL Kernels  2,348 views

SqueezCL: Squeezing OpenCL Kernels for Approximate Computing on Contemporary GPUs  2,347 views

Acceleration of Monte-Carlo Molecular Simulations on Hybrid Computing Architectures  2,347 views

A numerical tour of wave propagation  2,347 views

Massively parallel Monte Carlo for many-particle simulations on GPUs  2,347 views

Analyzing CUDA workloads using a detailed GPU simulator  2,347 views

Parallel one-versus-rest SVM training on the GPU  2,347 views

Application of Deep-Learning to Compiler-Based Graphs  2,347 views

Fine-Grained Synchronizations and Dataflow Programming on GPUs  2,347 views

GPU Implementation of Real-Time Biologically Inspired Face Detection using CUDA  2,346 views

Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs  2,345 views

MR-API: A Comprehensive API Framework for Heterogeneous Multi-core Systems using Map Reduce Programming Model  2,344 views

Fast and Flexible GPU Accelerated Binding Free Energy Calculations within the AMBER Molecular Dynamics Package  2,344 views

GpuCV: A GPU-Accelerated Framework for Image Processing and Computer Vision  2,344 views

An efficient KNN algorithm implemented on FPGA based heterogeneous computing system using OpenCL  2,343 views

Parallel Banding Algorithm to compute exact distance transform with the GPU  2,343 views

Implementation of Motion Estimation Based on Heterogeneous Parallel Computing System with OpenCL  2,343 views

3D-color video camera  2,342 views

A GPU accelerated spring mass system for surgical simulation  2,342 views

Face Detection with Improved Local Binary Patterns in CUDA  2,342 views

The FFT on a GPU  2,342 views

Energy-efficient Computing on Distributed GPUs using Dynamic Parallelism and GPU-controlled Communication  2,341 views

Grover: Looking for Performance Improvement by Disabling Local Memory Usage in OpenCL Kernels  2,341 views

GiMMiK – Generating Bespoke Matrix Multiplication Kernels for Various Hardware Accelerators; Applications in High-Order Computational Fluid Dynamics  2,340 views

Fast and scalable list ranking on the GPU  2,340 views

A collision detection algorithm using adaptive particle sensor  2,340 views

A Parallel Algorithm for Calculation of Large Determinants with High Accuracy for GPUs and MPI clusters  2,340 views

Pseudo Random Number Generators on Graphics Processing Units, with Applications in Finance  2,339 views

Discrete Shearlet Transform on GPU with Applications in Anomaly Detection and Denoising  2,339 views

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning  2,339 views

Graphics Programming on the Web WebCL Course Notes  2,339 views

Exascale Deep Learning for Climate Analytics  2,339 views

Computation on GPU of Eigenvalues and Eigenvectors of a Large Number of Small Hermitian Matrices  2,338 views

Design of a Hybrid Memory System for General-Purpose Graphics Processing Units  2,337 views

Array Program Transformation with Loo.py by Example: High-Order Finite Elements  2,337 views

gSLICr: SLIC superpixels at over 250Hz  2,337 views

The Bones Source-to-Source Compiler Manual  2,336 views

Parallelized Seeded Region Growing using CUDA  2,336 views

An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU implementation  2,336 views

Hybrid Parallel Streamline Extraction Combining MPI and OpenCL  2,336 views

Facial Recognition Using Neural Networks over GPGPU  2,336 views

A comparative study of GPU programming models and architectures using neural networks  2,335 views

Fast hydrodynamics on heterogenous many-core hardware  2,335 views

PErasure: a Parallel Cauchy Reed-Solomon Coding Library for GPUs  2,335 views

Accelerating Exact Similarity Search on CPU-GPU Systems  2,334 views

An Execution Model for OpenCL 2.0  2,334 views

Running Financial Risk Management Applications on FPGA in the Amazon Cloud  2,334 views

Togpu: Automatic Source Transformation from C++ to CUDA using Clang/LLVM  2,333 views

Real-Time Surface Extraction and Visualization of Medical Images using OpenCL and GPUs  2,333 views

 

Brief statistics for this page

Titles: 100

Total views: 234,844

 

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
