Papers on hgpu.org (.txt-file)
Stencil shadow volumes for complex and deformable objects

Stencil-Aware GPU Optimization of Iterative Solvers

StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems

StePS: A Multi-GPU Cosmological N-body Code for Compactified Simulations

Stereo depth with a Unified Architecture GPU

Stereo Matching Algorithm Using Population-Based Incremental Learning on GPU
Stereo Matching using Multi-Resolution Images on CUDA

Stereoscopic Ray Tracing on Graphics Processors

Stereoscopic Scene Flow Computation for 3D Motion Understanding
Stochastic Analysis of a Queue Length Model Using a Graphics Processing Unit

Stochastic Differential Equations simulation using GPU

Stochastic DT-MRI Connectivity Mapping on the GPU

Stochastic Gradient Descent on GPUs

Stochastic Progressive Photon Mapping for Dynamic Scenes

STOCHSIMGPU: Parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB

Stock trading strategy creation using GP on GPU

StoreGPU: exploiting graphics processing units to accelerate distributed storage systems

Strain Visualization of Ultra Sound Signals Processed by General Purpose Graphic Process Unit

Strassen’s Matrix Multiplication on GPUs

Strategies for Maximizing Utilization in multi-CPU & multi-GPU Heterogeneous Architectures

Strategies for Optimization of Parallel Programs

Strategies for preparing computer science students for the multicore world

Strategies for Protecting Intellectual Property when Using CUDA Applications on Graphics Processing Units

Strategies for the Heterogeneous Execution of Large-Scale Simulations on Hybrid Supercomputers

Strategies to minimise the total run time of cyclic graph based genetic programming with GPUs

Strategy Preserving Compilation for Parallel Functional Code

Stream computing on graphics hardware

Stream Join Processing on Heterogeneous Processors

Stream processing for fast and efficient rotated Haar-like features using rotated integral images

Stream Processing of Integral Images for Real-Time Object Detection

Stream processing of moment invariants for real-time classifiers
Stream-Centric Stereo Matching and View Synthesis: A High-Speed Approach on GPUs
StreamBlocks: A compiler for heterogeneous dataflow computing

StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs

Streamed Watershed Transform on GPU for Processing of Large Volume Data

Streaming Algorithms for Biological Sequence Alignment on GPUs
Streaming Applications on Heterogeneous Platforms

Streaming architectures and technology trends
Streaming Data from HDD to GPUs for Sustained Peak Performance

Streaming Dynamic Coarse-Grained CPU/GPU Workloads with Heterogeneous Pipelines in FastFlow

Streaming GPU Singular Value and Dynamic Mode Decompositions

Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks

Streaming-Oriented Parallelization of Domain-Independent Irregular Kernels

STREAMIT: Dynamic visualization and interactive exploration of text streams

Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping

StreamMR: An Optimized MapReduce Framework for AMD GPUs

StreamWorks: An Energy-efficient Embedded Co-processor for Stream Computing

Strega: An HTTP Server for FPGAs

Stress Tensor Field Visualization for Implant Planning in Orthopedics

Stressing the BER simulation of LDPC codes in the error floor region using GPU clusters

String Matching on a Multicore GPU Using CUDA
Striped Smith-Waterman speeds database searches six times over other SIMD implementations

Strong scaling of general-purpose molecular dynamics simulations on GPUs

Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices

Structured Orthogonal Inversion of Block p-Cyclic Matrices on Multicore with GPU Accelerators

STT-RAM for Shared Memory in GPUs

Studies Concerning the ATLAS IBL Calibration Architecture

Studies of quantum dots: Ab initio coupled-cluster analysis using OpenCL and GPU programming

Studies on CUDA Offloading for Real-Time Simulation and Visualization

Study and evaluation of an Irregular Graph Algorithm on Multicore and GPU Processor Architectures

Study and evaluation of improved automatic GPU offloading method

Study for measurement method for coal volume on base of GPU
Study of Bandwidth Partitioning for Co-executing GPU Kernels

Study of basic vector operations on Intel Xeon Phi and NVIDIA Tesla using OpenCL

Study of Convolution Algorithms using CPU and Graphics Hardware

Study of low density nuclear matter with quantum molecular dynamics: the role of the symmetry energy

Study of OpenCL Processing Models for FPGA Devices

Study of Sparse-Matrix Vector Multiplication (SpMV) on Different Architectures and Libraries

Study on acceleration technique for calculating near field of horn antenna based on GPU
Study on acceleration technique for two-dimensional FDTD algorithm based on GPU
Study on GPU-accelerated extraction of interconnects parasitic using CUDA and MPI
Study on semi-global matching algorithm extended for multi baseline matching and parallel processing method based on GPU

Study on volume rendering of CT slices based on ray casting
Study, Modelling and Implementation of the Level Set Method Used in Micromachining Processes

Studying the core-cusp problem in cold dark matter halos using N-body simulations on GPU clusters

Studying the Potential of Automatic Optimizations in the Intel FPGA SDK for OpenCL

Studying Thermal Management for Graphics-Processor Architectures

STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning

SU(2) Lattice Gauge Theory Simulations on Fermi GPUs

SU(2) Lattice QCD Simulations on Fermi GPUs

Sub-seasonal forecasting with a large ensemble of deep-learning weather prediction models

Subdivision Surface Evaluation as Sparse Matrix-Vector Multiplication

Subpixel reconstruction antialiasing for deferred shading

Suitability of NVIDIA GPUs for SKA1-Low

Super Earths and Dynamical Stability of Planetary Systems: First Parallel GPU Simulations Using GENGA

Supercharging Federated Learning with Flower and NVIDIA FLARE

Supercomputing and stellar dynamics

Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem

Superconducting proximity effect in graphene under inhomogeneous strain

SUPERGLUE: A Shared Memory Framework Using Data Versioning for Dependency-Aware Task-Based Parallelization

SUperman: Efficient Permanent Computation on GPUs

SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks

Superpipeline: A Universal Approach for Reducing GPU Memory Usage in Large Models

Supervised Hashing with Deep Neural Networks

Titles: 100
open PDFs: 86
packages: 19
