Papers on hgpu.org (.txt-file)
Scalable Clustering Using Graphics Processors
Scalable communication for high-order stencil computations using CUDA-aware MPI
Scalable Data Clustering using GPU Clusters
Scalable Dense Linear Algebra on Heterogeneous Hardware
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation
Scalable Distributed Fast Multipole Methods
Scalable Fast Multipole Methods on Distributed Heterogeneous Architectures
Scalable Fast Multipole Methods on Heterogeneous Architecture
Scalable framework for mapping streaming applications onto multi-GPU systems
Scalable GPU Acceleration of B-Spline Signal Processing Operations
Scalable GPU rendering of CSG models
Scalable heterogeneous parallelism for atmospheric modeling and simulation
Scalable instruction set simulator for thousand-core architectures running on GPGPUs
Scalable Kernel Fusion for Memory-Bound GPU Applications
Scalable Lattice Boltzmann Solvers for CUDA GPU Clusters
Scalable learning for object detection with GPU hardware
Scalable Metropolis Monte Carlo for simulation of hard shapes
Scalable Molecular Dynamics Simulation Using FPGAs and Multicore Processors
Scalable Multi Agent Simulation on the GPU
Scalable Multi-Cache Simulation Using GPUs
Scalable Multi-GPU 3-D FFT for TSUBAME 2.0 Supercomputer
Scalable multi-GPU implementation of the MAGFLOW simulator
Scalable Multi-GPU Simulation of Long-Range Molecular Dynamics
Scalable packet classification via GPU metaprogramming
Scalable Parallel Minimum Spanning Forest Computation
Scalable parallel programming with CUDA
Scalable Parallel Tridiagonal Algorithms with Diagonal Pivoting and Their Optimization for Many-Core Architectures
Scalable Programming Models for Massively Multicore Processors
Scalable Query Evaluation in Relational Databases
Scalable Simulation of 3D Wave Propagation in Semi-Infinite Domains Using the Finite Difference Method on a GPU Based Cluster
Scalable Simulation of Tsunamis Generated by Submarine Landslides on GPU clusters
Scalable SMT-based verification of GPU kernel functions
Scalable Software Defined FM-radio receiver running on desktop computers
Scalable Solution of Radiative Heat Transfer Problems by the Photon Monte Carlo Algorithm on Hybrid Computing Architectures
Scalable Streaming Tools for Analyzing N-body Simulations: Finding Halos and Investigating Excursion Sets in One Pass
Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth
Scalable Techniques for Scheduling and Mapping DSP Applications onto Embedded Multiprocessor Platforms
Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay
Scalable Verification Techniques for Data-Parallel Programs
Scalable, High Performance Fourier Domain Optical Coherence Tomography: Why FPGAs and Not GPGPUs
Scalar collapse in AdS with an OpenCL open source code
Scale-dependent and example-based grayscale stippling
Scale-space ridge detection with GPU acceleration
Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs
ScaleHLS: Scalable High-Level Synthesis through MLIR
Scaling behavior of topologically constrained polymer rings in a melt
Scaling Coupled Climate Models to Exascale: OpenACC-enabled ECEarth3 Earth System Model
Scaling CUDA for Distributed Heterogeneous Processors
Scaling Deep Learning on GPU and Knights Landing clusters
Scaling Deep Learning on Multiple In-Memory Processors
Scaling Fast Multipole Methods up to 4000 GPUs
Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer
Scaling Hierarchical N-body Simulations on GPU Clusters
Scaling High Performance Domain-Specific Language Implementation with Delite
Scaling IDS construction based on Non-negative Matrix factorization using GPU computing
Scaling LAPACK panel operations using parallel cache assignment
Scaling Lattice QCD beyond 100 GPUs
Scaling Monte Carlo Tree Search on Intel Xeon Phi
Scaling Multifluid Compressible Fluid Dynamics to 700,000 cores, 1.5 Pflop/s, and a Trillion Grid Cells
Scaling Performance of FFT Computation on an Industrial Integrated GPU Co-processor: Experiments with Algorithm Adaptation
Scaling Radio Astronomy Signal Correlation on Heterogeneous Supercomputers Using Various Data Distribution Methodologies
Scaling Recurrent Neural Network Language Models
Scaling Results for a Discontinuous Galerkin Finite-Element Wave Solver on Multi-GPU Systems
Scaling Soft Matter Physics to Thousands of GPUs in Parallel
Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures
Scaling-up spatially-explicit ecological models using graphics processors
SCALSALE: Scalable SALE Benchmark Framework for Supercomputers
Scan primitives for GPU computing
Scan Test Power Simulation on GPGPUs
Scandalously Parallelizable Mesh Generation
ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU
Scattering Parameters and Surface Normals from Homogeneous Translucent Materials using Photometric Stereo
Scattering Points in Parallel Coordinates
Scene Boundary Detection Technique Based on Bottom-Up Attention System and OpenCL Parallel Implementation
Scene image classifying via the Partially Connected Neural Network
Scene independent real-time indirect illumination
Scene Recognition Acceleration Using CUDA and OpenMP
SCF: a device- and language-independent task coordination framework for reconfigurable, heterogeneous systems
SCGPSim: A fast SystemC simulator on GPUs
Scheduling (ir)regular applications on heterogeneous platforms
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
Scheduling by Work-Stealing in Hybrid Parallel Architectures
Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs
Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures
Scheduling Dataflow Execution Across Multiple Accelerators
Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing
Scheduling for new computing platforms with GPUs
Scheduling Languages: A Past, Present, and Future Taxonomy
Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources
Scheduling on Manycore and Heterogeneous Graphics Processors
Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling
Scheduling processing of real-time data streams on heterogeneous multi-GPU systems
Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System’s Perspective
SciAI4Industry – Solving PDEs for industry-scale problems with deep learning
Scientific and Engineering Computing Using ATI Stream Technology
Scientific computation for simulations on programmable graphics hardware
Scientific Computation on Graphics Processing Unit using CUDA
Scientific Computation Through a GPU
Titles: 100
open PDFs: 89
packages: 13