Papers on hgpu.org (.txt-file)
A Performance and Scalability Analysis of the Tsunami Simulation EasyWave for Different Multi-Core Architectures and Programming Models
A Performance Comparison of Algebraic Multigrid Preconditioners on CPUs, GPUs, and Xeon Phis
A Performance Comparison of CUDA and OpenCL
A Performance Comparison of Different Graphics Processing Units Running Direct N-Body Simulations
A Performance Comparison of Sort and Scan Libraries for GPUs
A Performance Criteria for parallel Computation on basis of block size using CUDA Architecture
A Performance Model and Optimization Strategies for Automatic GPU Code Generation of PDE Systems Described by a Domain-Specific Language
A Performance Model for Memory Bandwidth Constrained Applications on Graphics Engines
A Performance Model for the Communication in Fast Multipole Methods on HPC Platforms
A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs
A Performance Optimization Support Framework for GPU-based Traffic Simulations with Negotiating Agents
A performance prediction model for the CUDA GPGPU platform
A performance spectrum for parallel computational frameworks that solve PDEs
A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations
A performance study of general-purpose applications on graphics processors using CUDA
A Performance Study of Zero Crossing Rate (ZCR) on Graphics Processors (GPUs) Using CUDA
A Performance-Portable SYCL Implementation of CRK-HACC for Exascale
A performance/cost evaluation for a GPU-based drug discovery application on volunteer computing
A Personal Surround Environment: Projective Display with Correction for Display Surface Geometry and Extreme Lens Distortion
A Pervasive Parallel Framework for Visualization
A pilgrimage to gravity on GPUs
A platform-independent tool for modeling parallel programs
A Polyphase Filter For GPUs And Multi-Core Processors
A polyphase filter for many-core architectures
A portable and high-performance matrix operations library for CPUs, GPUs and beyond
A portable C++ library for memory and compute abstraction on multi-core CPUs and GPUs
A Portable High-Productivity Approach to Program Heterogeneous Systems
A portable implementation of the radix sort algorithm in OpenCL
A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures
A portable platform for accelerated PIC codes and its application to GPUs using OpenACC
A Power Efficient Neural Network Implementation on Heterogeneous FPGA and GPU Devices
A power-aware symbiotic scheduling algorithm for concurrent GPU kernels
A practical and robust bump-mapping technique for today’s GPU’s
A practical approach of curved ray prestack Kirchhoff Time Migration on GPGPU
A practical multi-viewer tabletop autostereoscopic display
A Practical Performance Model for Compute and Memory Bound GPU Kernels
A Practical Quicksort Algorithm for Graphics Processors
A Practical Visualization Strategy for Large-Scale Supernovae CFD Simulations
A Practical, Targeted, and Stealthy Attack Against WPA Enterprise Authentication
A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers
A Predictive Shutdown Technique for GPU Shader Processors
A Preliminary Review of Literature on Parallel Constraint Solving
A preliminary study of OpenCL for accelerating CT reconstruction and image recognition
A Problem-Based Learning Approach to GPU Computing
A Program Behavior Study of Block Cryptography Algorithms on GPGPU
A Programmable Processing Array Architecture Supporting Dynamic Task Scheduling and Module-Level Prefetching
A programming framework for data streaming on the Xeon Phi
A programming language interface to describe transformations and code generation
A Programming Model for GPU Load Balancing
A programming model for GPU-based parallel computing with scalability and abstraction
A progressive mesh method for physical simulations using lattice Boltzmann method on single-node multi-gpu architectures
A prototyping environment for high performance reconfigurable computing
A pseudospectral matrix method for time-dependent tensor fields on a spherical shell
A pure vision-based approach to topological SLAM
A Push-Relabel-Based Maximum Cardinality Bipartite Matching Algorithm on GPUs
A Qualitative Comparison Study Between Common GPGPU Frameworks
A Quantitative Comparison of Emulated Shared Memory Architectures to Current Multicore CPUs and GPUs
A Quantitative Performance Analysis Model for GPU Architectures
A Quantitative Study of Irregular Programs on GPUs
A Quasi-Parallel GPU-Based Algorithm for Delaunay Edge-Flips
A QUDA-branch to compute disconnected diagrams in GPUs
A Ray Tracing Implementation Performance Comparison between the CPU and the GPU
A readahead prefetcher for GPU file system layer
A real time Breast Microwave Radar imaging reconstruction technique using SIMT based interpolation
A real-time 1080p 2D-to-3D video conversion system
A real-time augmented view synthesis system for transparent car pillars
A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors
A real-time coarse-to-fine multiview capture system for all-in-focus rendering on a light-field display
A Real-time Coherent Dedispersion Pipeline for the Giant Metrewave Radio Telescope
A Real-Time Computer Vision Library for Heterogeneous Processing Environments
A Real-time GPU Implementation of the SIFT Algorithm for Large-Scale Video Analysis Tasks
A Real-Time Multigrid Finite Hexahedra Method for Elasticity Simulation using CUDA
A Real-Time ProCam System for Interaction with Chinese Ink-and-Wash Cartoons
A real-time procedural shading system for programmable graphics hardware
A Real-time Single Pulse Detection Algorithm for GPUs
A Real-Time Soft Shadow Rendering Algorithm by Occluder-Discretization
A real-time subsurface scattering rendering method for dynamic objects
A Real-Time, GPU-Based, Non-Imaging Back-End for Radio Telescopes
A realtime GPU subdivision kernel
A Reconfigurable GPU Implementation for Tomlinson-Harashima Precoding
A Reconfigurable Processor for Phylogenetic Inference
A reduced order explicit dynamic finite element algorithm for surgical simulation
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing
A refactoring tool to extract GPU kernels
A Region Growing Segmentation Algorithm for GPUs
A Reliable Throughput Gain on GPUs
A rendering method for simulated emission nebulae
A Reproducible Research Methodology for Designing and Conducting Faithful Simulations of Dynamic HPC Applications
A Research of MapReduce with GPU Acceleration
A Resource Selection System for Cycle Stealing in GPU Grids
A Resource-Efficient Computing Paradigm for Computational Protein Modeling Applications
A Restructuring Algorithm for CUDA
A Reverse-Projecting Pixel-Level Painting Algorithm
A Review of CUDA, MapReduce, and Pthreads Parallel Computing Models
A Review of the Parallelization Strategies for Iterative Algorithms
A Review on Parallelization of Node based Game Tree Search Algorithms on GPU
A Rigid Body Physics Engine for Interactive Applications
A Road Marking Extraction Method Using GPGPU
A Run-Time Adaptive FPGA Architecture for Monte Carlo Simulations
Titles: 100
open PDFs: 92
packages: 14