Papers on hgpu.org (.txt-file)
Stream processing of moment invariants for real-time classifiers
Stream-Centric Stereo Matching and View Synthesis: A High-Speed Approach on GPUs
StreamBlocks: A compiler for heterogeneous dataflow computing
StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs
Streamed Watershed Transform on GPU for Processing of Large Volume Data
Streaming Algorithms for Biological Sequence Alignment on GPUs
Streaming Applications on Heterogeneous Platforms
Streaming architectures and technology trends
Streaming Data from HDD to GPUs for Sustained Peak Performance
Streaming Dynamic Coarse-Grained CPU/GPU Workloads with Heterogeneous Pipelines in FastFlow
Streaming GPU Singular Value and Dynamic Mode Decompositions
Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks
Streaming-Oriented Parallelization of Domain-Independent Irregular Kernels
STREAMIT: Dynamic visualization and interactive exploration of text streams
Streamlining GPU applications on the fly: thread divergence elimination through runtime thread-data remapping
StreamMR: An Optimized MapReduce Framework for AMD GPUs
StreamWorks: An Energy-efficient Embedded Co-processor for Stream Computing
Strega: An HTTP Server for FPGAs
Stress Tensor Field Visualization for Implant Planning in Orthopedics
Stressing the BER simulation of LDPC codes in the error floor region using GPU clusters
String Matching on a Multicore GPU Using CUDA
Striped Smith-Waterman speeds database searches six times over other SIMD implementations
Strong scaling of general-purpose molecular dynamics simulations on GPUs
Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices
Structured Orthogonal Inversion of Block p-Cyclic Matrices on Multicore with GPU Accelerators
STT-RAM for Shared Memory in GPUs
Studies Concerning the ATLAS IBL Calibration Architecture
Studies of quantum dots: Ab initio coupled-cluster analysis using OpenCL and GPU programming
Studies on CUDA Offloading for Real-Time Simulation and Visualization
Study and evaluation of an Irregular Graph Algorithm on Multicore and GPU Processor Architectures
Study and evaluation of improved automatic GPU offloading method
Study for measurement method for coal volume on base of GPU
Study of Bandwidth Partitioning for Co-executing GPU Kernels
Study of basic vector operations on Intel Xeon Phi and NVIDIA Tesla using OpenCL
Study of Convolution Algorithms using CPU and Graphics Hardware
Study of low density nuclear matter with quantum molecular dynamics: the role of the symmetry energy
Study of OpenCL Processing Models for FPGA Devices
Study of Sparse-Matrix Vector Multiplication (SpMV) on Different Architectures and Libraries
Study on acceleration technique for calculating near field of horn antenna based on GPU
Study on acceleration technique for two-dimensional FDTD algorithm based on GPU
Study on GPU-accelerated extraction of interconnects parasitic using CUDA and MPI
Study on semi-global matching algorithm extended for multi baseline matching and parallel processing method based on GPU
Study on volume rendering of CT slices based on ray casting
Study, Modelling and Implementation of the Level Set Method Used in Micromachining Processes
Studying the core-cusp problem in cold dark matter halos using N-body simulations on GPU clusters
Studying the Potential of Automatic Optimizations in the Intel FPGA SDK for OpenCL
Studying Thermal Management for Graphics-Processor Architectures
SU(2) Lattice Gauge Theory Simulations on Fermi GPUs
SU(2) Lattice QCD Simulations on Fermi GPUs
Sub-seasonal forecasting with a large ensemble of deep-learning weather prediction models
Subdivision Surface Evaluation as Sparse Matrix-Vector Multiplication
Subpixel reconstruction antialiasing for deferred shading
Suitability of NVIDIA GPUs for SKA1-Low
Super Earths and Dynamical Stability of Planetary Systems: First Parallel GPU Simulations Using GENGA
Supercomputing and stellar dynamics
Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem
Superconducting proximity effect in graphene under inhomogeneous strain
SUPERGLUE: A Shared Memory Framework Using Data Versioning for Dependency-Aware Task-Based Parallelization
SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks
Supervised Hashing with Deep Neural Networks
Support for Parallel Scan in OpenMP
Support Operator Rupture Dynamics on GPU
Support Vector Machines on GPU with Sparse Matrix Format
Supporting Applications Involving Dynamic Data Structures and Irregular Memory Access on Emerging Parallel Platforms
Supporting CUDA for an extended RISC-V GPU architecture
Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework
Supporting Heterogenous Computing Environments in SaC
Supporting input dependent access pattern algorithms on GPUs using GPUfs
Supporting Iteration in a Heterogeneous Data Flow Engine
Supporting mixed-datatype matrix multiplication within the BLIS framework
Supporting Preemptive Task Executions and Memory Copies in GPGPUs
Supporting x86-64 Address Translation for 100s of GPU Lanes
Surface Compression Using Dynamic Color Palettes
Surface Normal Integration for Convex Space-time Multi-view Reconstruction
Surface quality assessment of subdivision surfaces on programmable graphics hardware
Surface Reconstruction from Scattered Point via RBF Interpolation on GPU
Survey and Benchmarking of Machine Learning Accelerators
Survey of Domain-Specific Languages for FPGA Computing
Survey of GPU water simulation in game engine
Survey on Benchmarks for a GPU Based Multi Camera Stereo Matching Algorithm
Survey on Efficient Linear Solvers for Porous Media Flow Models on Recent Hardware Architectures
Survey On The Off-Chip Scheduling of Memory Accesses in the Memory Interface Of GPUs
Survey paper on Deep Learning on GPUs
Sustainable GPU Computing at Scale
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
SW# – GPU enabled exact alignments on genome scale
SW#db: GPU-accelerated exact sequence similarity database search
Swan: A tool for porting CUDA programs to OpenCL
SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors
Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems
Swarm’s flight: Accelerating the particles using C-CUDA
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer
Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU
Swept Volume approximation of polygon soups
SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences
Titles: 100
Open PDFs: 85
Packages: 21