Papers on hgpu.org
Structured Orthogonal Inversion of Block p-Cyclic Matrices on Multicore with GPU Accelerators
STT-RAM for Shared Memory in GPUs
Studies Concerning the ATLAS IBL Calibration Architecture
Studies of quantum dots: Ab initio coupled-cluster analysis using OpenCL and GPU programming
Studies on CUDA Offloading for Real-Time Simulation and Visualization
Study and evaluation of an Irregular Graph Algorithm on Multicore and GPU Processor Architectures
Study and evaluation of improved automatic GPU offloading method
Study for measurement method for coal volume on base of GPU
Study of Bandwidth Partitioning for Co-executing GPU Kernels
Study of basic vector operations on Intel Xeon Phi and NVIDIA Tesla using OpenCL
Study of Convolution Algorithms using CPU and Graphics Hardware
Study of low density nuclear matter with quantum molecular dynamics: the role of the symmetry energy
Study of OpenCL Processing Models for FPGA Devices
Study of Sparse-Matrix Vector Multiplication (SpMV) on Different Architectures and Libraries
Study on acceleration technique for calculating near field of horn antenna based on GPU
Study on acceleration technique for two-dimensional FDTD algorithm based on GPU
Study on GPU-accelerated extraction of interconnects parasitic using CUDA and MPI
Study on semi-global matching algorithm extended for multi baseline matching and parallel processing method based on GPU
Study on volume rendering of CT slices based on ray casting
Study, Modelling and Implementation of the Level Set Method Used in Micromachining Processes
Studying the core-cusp problem in cold dark matter halos using N-body simulations on GPU clusters
Studying the Potential of Automatic Optimizations in the Intel FPGA SDK for OpenCL
Studying Thermal Management for Graphics-Processor Architectures
STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning
SU(2) Lattice Gauge Theory Simulations on Fermi GPUs
SU(2) Lattice QCD Simulations on Fermi GPUs
Sub-seasonal forecasting with a large ensemble of deep-learning weather prediction models
Subdivision Surface Evaluation as Sparse Matrix-Vector Multiplication
Subpixel reconstruction antialiasing for deferred shading
Suitability of NVIDIA GPUs for SKA1-Low
Super Earths and Dynamical Stability of Planetary Systems: First Parallel GPU Simulations Using GENGA
Supercharging Federated Learning with Flower and NVIDIA FLARE
Supercomputing and stellar dynamics
Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem
Superconducting proximity effect in graphene under inhomogeneous strain
SUPERGLUE: A Shared Memory Framework Using Data Versioning for Dependency-Aware Task-Based Parallelization
SUperman: Efficient Permanent Computation on GPUs
SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks
SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks
Superpipeline: A Universal Approach for Reducing GPU Memory Usage in Large Models
Supervised Hashing with Deep Neural Networks
Support for Parallel Scan in OpenMP
Support Operator Rupture Dynamics on GPU
Support Vector Machines on GPU with Sparse Matrix Format
Supporting Applications Involving Dynamic Data Structures and Irregular Memory Access on Emerging Parallel Platforms
Supporting CUDA for an extended RISC-V GPU architecture
Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework
Supporting Heterogenous Computing Environments in SaC
Supporting input dependent access pattern algorithms on GPUs using GPUfs
Supporting Iteration in a Heterogeneous Data Flow Engine
Supporting mixed-datatype matrix multiplication within the BLIS framework
Supporting Preemptive Task Executions and Memory Copies in GPGPUs
Supporting x86-64 Address Translation for 100s of GPU Lanes
Surface Compression Using Dynamic Color Palettes
Surface Normal Integration for Convex Space-time Multi-view Reconstruction
Surface quality assessment of subdivision surfaces on programmable graphics hardware
Surface Reconstruction from Scattered Point via RBF Interpolation on GPU
Survey and Benchmarking of Machine Learning Accelerators
Survey of Domain-Specific Languages for FPGA Computing
Survey of GPU water simulation in game engine
Survey of HPC in US Research Institutions
Survey on Benchmarks for a GPU Based Multi Camera Stereo Matching Algorithm
Survey on Efficient Linear Solvers for Porous Media Flow Models on Recent Hardware Architectures
Survey On The Off-Chip Scheduling of Memory Accesses in the Memory Interface Of GPUs
Survey paper on Deep Learning on GPUs
Sustainable GPU Computing at Scale
Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale
SW# – GPU enabled exact alignments on genome scale
SW#db: GPU-accelerated exact sequence similarity database search
Swan: A tool for porting CUDA programs to OpenCL
SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors
Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems
Swarm’s flight: Accelerating the particles using C-CUDA
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer
Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU
Swept Volume approximation of polygon soups
SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences
Switching to High Gear: Opportunities for Grand-Scale Real-Time Parallel Simulations
Swizzle Inventor: Data Movement Synthesis for GPU Kernels
SWM: Simplified Wu-Manber for GPU-based Deep Packet Inspection
SWPS3 – fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and x86/SSE2
SYCL Code Generation for Multigrid Methods
SYCL compute kernels for ExaHyPE
SYCL in the edge: performance and energy evaluation for heterogeneous acceleration
SYCL in the Edge: Performance Evaluation for Heterogeneous Acceleration
SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs
SYCL-Bench: A Versatile Cross-Platform Benchmark Suite for Heterogeneous Computing
SYCL-Bench: A Versatile Single-Source Benchmark Suite for Heterogeneous Computing
SYCLops: A SYCL Specific LLVM to MLIR Converter
Sylkan: Towards a Vulkan Compute Target Platform for SYCL
Symbolic Crosschecking of Data-Parallel Floating Point Code
Symbolic crosschecking of floating-point and SIMD code
Symbolic Differentiation in GPU Shaders
Symbolic Testing of OpenCL Code
Symphony: A Scheduler for Client-Server Applications on Coprocessor-based Heterogeneous Clusters
Synchronization and Coordination in Heterogeneous Processors
Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming
Synergia CUDA: GPU-accelerated accelerator modeling package
Titles: 100
open PDFs: 91
packages: 32