Papers on hgpu.org (.txt-file)
A Runtime Controller for OpenCL Applications on Heterogeneous System Architectures
A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms
A Scalable and Reconfigurable Shared-Memory Graphics Cluster Architecture
A Scalable Approach to Solving Dense Linear Algebra Problems on Hybrid CPU-GPU Systems
A Scalable End-to-End Optimized Real-Time Image-Based Rendering Framework on Graphics Hardware
A Scalable Framework for Heterogeneous GPU-Based Clusters
A Scalable Framework for Monte Carlo Simulation Using FPGA-based Hardware Accelerators with Application to SPECT Imaging
A Scalable GPU-based Approach to Accelerate the Multiple-Choice Knapsack Problem
A scalable GPU-based approach to shading and shadowing for photorealistic real-time augmented reality
A Scalable graph-cut algorithm for N-D grids
A Scalable Heterogeneous Parallelization Framework for Iterative Local Searches
A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators
A scalable hybrid algorithm based on domain decomposition and algebraic multigrid for solving partial differential equations on a cluster of CPU/GPUs
A Scalable Hybrid FPGA/GPU FX Correlator
A Scalable Lane Detection Algorithm on COTSs with OpenCL
A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow
A Scalable, Efficient Scheme for Evaluation of Stencil Computations over Unstructured Meshes
A scalable, numerically stable, high-performance tridiagonal solver using GPUs
A scheduling and runtime framework for a cluster of heterogeneous machines with multiple accelerators
A Scheduling Framework for a Heterogeneous Parallel Architecture
A Screen Space Quality Method for Data Abstraction
A scripting language for Digital Content Creation applications
A second generation of DEFG: Declarative Framework for GPUs
A Second-Order Distributed Trotter-Suzuki Solver with a Hybrid Kernel
A Self-Optimizing Framework for Developing Metrology Software on Massive Parallel Processor Architectures
A self-organization based optical flow estimator with GPU implementation
A self-organization based optical flow estimator with GPU implementation (thesis)
A Semi-Automated Tool Flow for Roofline Analysis of OpenCL Kernels on Accelerators
A Shader Library for OpenGL 4 and GLSL 4.3 Learning and Development
A shared file system abstraction for heterogeneous architectures
A shared-scene-graph image-warping architecture for VR: Low latency versus image quality
A short guide to CUDA C: For physicists with multi-core graphics cards
A Short Note on Gaussian Process Modeling for Large Datasets using Graphics Processing Units
A SIMD Interpreter for Genetic Programming on GPU Graphics Cards
A SIMD-efficient 14 instruction shader program for high-throughput microtriangle rasterization
A Similarity Measure for GPU Kernel Subgraph Matching
A Similarity-Based Analysis Tool for Scientific Application Porting
A simple and efficient way to compute depth maps for multi-view videos
A simple and flexible volume rendering framework for graphics-hardware-based raycasting
A simple GPU-based approach for 3D Voronoi diagram construction and visualization
A simple method to accelerate fringe analysis algorithms based on graphics processing unit and MATLAB
A Simplified and Accurate Model of Power-Performance Efficiency on Emergent GPU Architectures
A Simulation Framework for Scheduling Performance Evaluation on CPU-GPU Heterogeneous System
A simulation suite for lattice Boltzmann based real time CFD applications exploiting multi-level parallelism on modern multi- and many-core architectures
A Simulation Suite for Lattice-Boltzmann based Real-Time CFD Applications Exploiting Multi-Level Parallelism on modern Multi- and Many-Core Architectures
A Simulator for the Cafadis Real Time 3DTV Camera
A Single (Unified) Shader GPU Microarchitecture for Embedded Systems
A small-world network model for distributed storage of semantic metadata
A Smart GPU Implementation of an Elliptic Kernel for an Ocean Global Circulation Model
A smooth particle hydrodynamics code to model collisions between solid, self-gravitating objects
A Software Framework for the Detection and Classification of Biological Targets in Bio-Nano Sensing
A Software-Based Self Test of CUDA Fermi GPUs
A Sorting Library for FPGA Implementation in OpenCL Programming
A Sparse Matrix Personality for the Convey HC-1
A sparse octree gravitational N-body code that runs entirely on the GPU processor
A Spiking Neural P system simulator based on CUDA
A Splitting Algorithm for Directional Regularization and Sparsification
A stand-alone Finite Difference Time Domain (FDTD) simulation for Integrated Optoelectronics Laboratory
A state-of-the-art password strength analysis demonstrator
A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning
A Static Load Balancing Scheme for Parallel Volume Rendering on Multi-GPU Clusters
A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL
A Stencil DSEL for Single Code Accelerated Computing with SYCL
A stencil-based implementation of Parareal in the C++ domain specific embedded language STELLA
A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU
A stereoscopic movie player with real-time content adaptation to the display geometry
A Stochastic-based Optimized Schwarz Method for the Gravimetry Equations on GPU Clusters
A straightforward CUDA implementation for interactive ray-tracing
A Straightforward Preprocessing Approach for Accelerating Convex Hull Computations on the GPU
A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs
A Strategy for Automatically Generating High Performance CUDA Code for a GPU Accelerator from a Specialized Fortran Code Expression
A Stream Processor Cluster Architecture Model with the Hybrid Technology of MPI and CUDA
A stream-computing extension to OpenMP
A streaming model for nested data parallelism
A streaming narrow-band algorithm: interactive computation and visualization of level sets
A structural analysis of the A5/1 state transition graph
A structured parallel periodic Arnoldi shooting algorithm for RF-PSS analysis based on GPU platforms
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers
A Study of CUDA Acceleration and Impact of Data Transfer Overhead in Heterogeneous Environment
A Study of Data Partitioning on OpenCL-based FPGAs
A study of integer sorting on multicores
A Study of Mixed Precision Strategies for GMRES on GPUs
A study of parallel evolution strategy: pattern search on a GPU computing platform
A Study of Parallel Sorting Algorithms Using CUDA and OpenMP
A Study of Performance Programming of CPU, GPU accelerated Computers and SIMD Architecture
A Study of Productivity and Performance of Modern Vector Processors
A Study of Real-Time Lighting Effects
A Study of Scheduling a Neuro-imaging Application On a Heterogeneous CPU-GPU Cluster
A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs
A Study of Successive Over-relaxation Method Parallelization Over Modern HPC Languages
A Study of the Parallelization of Hybrid SAT Solver using CUDA
A Study of the Potential of Locality-Aware Thread Scheduling for GPUs
A study of the speed and the accuracy of the Boundary Element Method as applied to the computational simulation of biological organs
A Study of Time and Energy Efficient Algorithms for Parallel and Heterogeneous Computing
A Study on Efficient Application Mapping on Parallel Computing Accelerators
A Study on GPU Computing and Accelerating Simulation of Sedimentary Rock Structure
A Study on Neural-based Code Summarization in Low-resource Settings
A Study on Parallel Imaging Algorithm of 3D Geological Data
A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation
Titles: 100
open PDFs: 93
packages: 13