Papers on hgpu.org (.txt-file)
A particle-based method for viscoelastic fluids animation
A pattern recognition system for prostate mass spectra discrimination based on the CUDA parallel programming model

A Pattern Specification and Optimizations Framework for Accelerating Scientific Computations on Heterogeneous Clusters

A PC-based fully-programmable medical ultrasound imaging system using a graphics processing unit
A PCG Implementation of an Elliptic Kernel in an Ocean Global Circulation Model Based on GPU Libraries

A Performance Analysis Framework for Identifying Potential Benefits in GPGPU Applications

A Performance Analysis Framework for Optimizing OpenCL Applications on FPGAs

A Performance and Scalability Analysis of the Tsunami Simulation EasyWave for Different Multi-Core Architectures and Programming Models

A Performance Comparison of Algebraic Multigrid Preconditioners on CPUs, GPUs, and Xeon Phis

A Performance Comparison of CUDA and OpenCL

A Performance Comparison of Different Graphics Processing Units Running Direct N-Body Simulations

A Performance Comparison of Sort and Scan Libraries for GPUs

A Performance Criteria for parallel Computation on basis of block size using CUDA Architecture

A Performance Model and Optimization Strategies for Automatic GPU Code Generation of PDE Systems Described by a Domain-Specific Language

A Performance Model for Memory Bandwidth Constrained Applications on Graphics Engines

A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms

A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs

A Performance Optimization Support Framework for GPU-based Traffic Simulations with Negotiating Agents

A Performance Portable Matrix Free Dense MTTKRP in GenTen

A performance prediction model for the CUDA GPGPU platform

A performance spectrum for parallel computational frameworks that solve PDEs

A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations

A performance study of general-purpose applications on graphics processors using CUDA

A Performance Study of Zero Crossing Rate (ZCR) on Graphics Processors (GPUs) Using CUDA

A Performance-Portable SYCL Implementation of CRK-HACC for Exascale

A performance/cost evaluation for a GPU-based drug discovery application on volunteer computing

A Personal Surround Environment: Projective Display with Correction for Display Surface Geometry and Extreme Lens Distortion

A Pervasive Parallel Framework for Visualization

A pilgrimage to gravity on GPUs

A platform-independent tool for modeling parallel programs

A Polyphase Filter For GPUs And Multi-Core Processors

A polyphase filter for many-core architectures

A portable and high-performance matrix operations library for CPUs, GPUs and beyond

A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs

A Portable High-Productivity Approach to Program Heterogeneous Systems

A portable implementation of the radix sort algorithm in OpenCL

A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures

A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

A Power Efficient Neural Network Implementation on Heterogeneous FPGA and GPU Devices

A power-aware symbiotic scheduling algorithm for concurrent GPU kernels

A Power-Efficient Scheduling Approach in a CPU-GPU Computing System by Thread-Based Parallel Programming

A practical and robust bump-mapping technique for today’s GPUs

A practical approach of curved ray prestack Kirchhoff Time Migration on GPGPU

A practical multi-viewer tabletop autostereoscopic display

A Practical Performance Model for Compute and Memory Bound GPU Kernels

A Practical Quicksort Algorithm for Graphics Processors

A Practical Visualization Strategy for Large-Scale Supernovae CFD Simulations

A Practical, Targeted, and Stealthy Attack Against WPA Enterprise Authentication

A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers

A Predictive Shutdown Technique for GPU Shader Processors
A Preliminary Review of Literature on Parallel Constraint Solving

A preliminary study of OpenCL for accelerating CT reconstruction and image recognition
A Problem-Based Learning Approach to GPU Computing

A Program Behavior Study of Block Cryptography Algorithms on GPGPU

A Programmable Processing Array Architecture Supporting Dynamic Task Scheduling and Module-Level Prefetching

A programming framework for data streaming on the Xeon Phi

A programming language interface to describe transformations and code generation

A Programming Model for GPU Load Balancing

A programming model for GPU-based parallel computing with scalability and abstraction

A progressive mesh method for physical simulations using lattice Boltzmann method on single-node multi-GPU architectures

A prototyping environment for high performance reconfigurable computing

A pseudospectral matrix method for time-dependent tensor fields on a spherical shell

A pure vision-based approach to topological SLAM

A Push-Relabel-Based Maximum Cardinality Bipartite Matching Algorithm on GPUs

A Qualitative Comparison Study Between Common GPGPU Frameworks

A Quantitative Comparison of Emulated Shared Memory Architectures to Current Multicore CPUs and GPUs

A Quantitative Performance Analysis Model for GPU Architectures

A Quantitative Study of Irregular Programs on GPUs

A Quasi-Parallel GPU-Based Algorithm for Delaunay Edge-Flips

A QUDA-branch to compute disconnected diagrams in GPUs

A Ray Tracing Implementation Performance Comparison between the CPU and the GPU

A readahead prefetcher for GPU file system layer

A real time Breast Microwave Radar imaging reconstruction technique using SIMT based interpolation

A real-time 1080p 2D-to-3D video conversion system

A real-time augmented view synthesis system for transparent car pillars

A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors

A real-time coarse-to-fine multiview capture system for all-in-focus rendering on a light-field display

A Real-time Coherent Dedispersion Pipeline for the Giant Metrewave Radio Telescope

A Real-Time Computer Vision Library for Heterogeneous Processing Environments

A Real-time GPU Implementation of the SIFT Algorithm for Large-Scale Video Analysis Tasks

A Real-Time Multigrid Finite Hexahedra Method for Elasticity Simulation using CUDA
A Real-Time ProCam System for Interaction with Chinese Ink-and-Wash Cartoons

A real-time procedural shading system for programmable graphics hardware

A Real-time Single Pulse Detection Algorithm for GPUs

A Real-Time Soft Shadow Rendering Algorithm by Occluder-Discretization
A real-time subsurface scattering rendering method for dynamic objects
A Real-Time, GPU-Based, Non-Imaging Back-End for Radio Telescopes

A realtime GPU subdivision kernel

A Reconfigurable GPU Implementation for Tomlinson-Harashima Precoding

A Reconfigurable Processor for Phylogenetic Inference
A reduced order explicit dynamic finite element algorithm for surgical simulation
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing

A refactoring tool to extract GPU kernels

A Region Growing Segmentation Algorithm for GPUs

A Reliable Throughput Gain on GPUs

A rendering method for simulated emission nebulae

A Reproducible Research Methodology for Designing and Conducting Faithful Simulations of Dynamic HPC Applications

A Research of MapReduce with GPU Acceleration

A Resource Selection System for Cycle Stealing in GPU Grids

Titles: 100
open PDFs: 91
packages: 15
