Papers on hgpu.org (.txt-file)
Scaling Monte Carlo Tree Search on Intel Xeon Phi
Scaling Multifluid Compressible Fluid Dynamics to 700,000 cores, 1.5 Pflop/s, and a Trillion Grid Cells
Scaling On-Device GPU Inference for Large Generative Models
Scaling Performance of FFT Computation on an Industrial Integrated GPU Co-processor: Experiments with Algorithm Adaptation
Scaling Radio Astronomy Signal Correlation on Heterogeneous Supercomputers Using Various Data Distribution Methodologies
Scaling Recurrent Neural Network Language Models
Scaling Results for a Discontinuous Galerkin Finite-Element Wave Solver on Multi-GPU Systems
Scaling Soft Matter Physics to Thousands of GPUs in Parallel
Scaling SU(2) to 1000 GPUs using HiRep
Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures
Scaling-up spatially-explicit ecological models using graphics processors
SCALSALE: Scalable SALE Benchmark Framework for Supercomputers
Scan primitives for GPU computing
Scan Test Power Simulation on GPGPUs
Scandalously Parallelizable Mesh Generation
ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU
Scattering Parameters and Surface Normals from Homogeneous Translucent Materials using Photometric Stereo
Scattering Points in Parallel Coordinates
Scene Boundary Detection Technique Based on Bottom-Up Attention System and OpenCL Parallel Implementation
Scene image classifying via the Partially Connected Neural Network
Scene independent real-time indirect illumination
Scene Recognition Acceleration Using CUDA and OpenMP
SCF: a device- and language-independent task coordination framework for reconfigurable, heterogeneous systems
SCGPSim: A fast SystemC simulator on GPUs
Scheduling (ir)regular applications on heterogeneous platforms
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
Scheduling by Work-Stealing in Hybrid Parallel Architectures
Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs
Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures
Scheduling Dataflow Execution Across Multiple Accelerators
Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing
Scheduling for new computing platforms with GPUs
Scheduling Languages: A Past, Present, and Future Taxonomy
Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources
Scheduling on Manycore and Heterogeneous Graphics Processors
Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling
Scheduling processing of real-time data streams on heterogeneous multi-GPU systems
Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System’s Perspective
SciAI4Industry – Solving PDEs for industry-scale problems with deep learning
Scientific and Engineering Computing Using ATI Stream Technology
Scientific computation for simulations on programmable graphics hardware
Scientific Computation on Graphics Processing Unit using CUDA
Scientific Computation Through a GPU
Scientific Computing on Heterogeneous Architectures
Scientific Computing on Hybrid Architectures
Scientific Computing Using Consumer Video-Gaming Hardware Devices
Scientific Computing with Python on GPUs
Scientific GPU Programming with Data-Flow Languages
Scientific Programming for Heterogeneous Systems – Bridging the Gap between Algorithms and Applications
Scientific Visualization in Astronomy: Towards the Petascale Astronomy Era
Scope for performance enhancement of CMU Sphinx by parallelising with OpenCL
Scope is all you need: Transforming LLMs for HPC Code
Scout: a data-parallel programming language for graphics processors
Seamless acceleration of Fortran intrinsics via AMD AI engines
Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio
Seamless GPU acceleration for C++ based physics with the Metal Shading Language on Apple’s M series unified chips
Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures
Searching for a counterexample of Kurepa’s Conjecture
Searching for Concurrent Design Patterns in Video Games
Searching for sinks of Henon map using a multiple-precision GPU arithmetic library
Second Order Pre-Integrated Volume Rendering
Secret Key Cryptography Using Graphics Cards
Secure 3D graphics for virtual machines
Secure Distributed Computing on a Manycore Cloud
SecureMed: Secure Medical Computation using GPU-Accelerated Homomorphic Encryption Scheme
Securing GPU via Region-based Bounds Checking
Seeded ND medical image segmentation by cellular automaton on GPU
Seeing through the fog: an algorithm for fast and accurate touch detection in optical tabletop surfaces
Seer: Predictive Runtime Kernel Selection for Irregular Problems
Seismic Attributes Extraction Based on GPU
Seismic damage simulation for urban buildings based on high-performance GPU computing
Seismic imaging based on spectral differentiation matrix and GPU implementation
Seismic volume visualization for horizon extraction
Seismic Wave Propagation Simulation Using Accelerated Support Operator Rupture Dynamics on Multi-GPU
Seismic Wave Propagation Simulation Using Support Operator Method on multi-GPU system
Selecting the Best Tridiagonal System Solver Projected on Multi-Core CPU and GPU Platforms
Selection algorithm of graphic accelerators in heterogeneous cluster for optimization computing
Selection of Task Implementations in the Nanos++ Runtime
Self-Adapting Parallel Framework for Long-Term Object Tracking
Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures
Self-calibration of geometric and radiometric parameters for cone-beam computed tomography
self-CD: Interactive Self-collision Detection for Deformable Body Simulation Using GPUs
Self-Configuring Applications for Heterogeneous Systems: Program Composition and Optimization Using Cognitive Techniques
Self-Supervised Clustering for Codebook Construction: An Application to Object Localization
Self-Tuning Distribution of DB-Operations on Hybrid CPU/GPU Platforms
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Semantic Pose using Deep Networks Trained on Synthetic RGB-D
Semantic Segmentation of Colon Glands with Deep Convolutional Neural Networks and Total Variation Segmentation
SemCache: Semantics-aware Caching for Efficient GPU Offloading
Semi-Analytic Solutions to the Radiative Transfer Equations via Heterogeneous Computing
Semi-Global Filtering of Airborne LiDAR Data for Fast Extraction of Digital Terrain Models
Semi-Global Matching - Motivation, Developments and Applications
Separable projection integrals for higher-order correlators of the cosmic microwave sky: Acceleration by factors exceeding 100
Separate Compilation in a Language-Integrated Heterogeneous Environment
Sequence alignment with GPU: Performance and design challenges
Sequence Data Indexing Method Exploiting the Parallel Processing Resources of GPGPU
Sequence Homology Search using Fine-Grained Cycle Sharing of Idle GPUs
Titles: 100
open PDFs: 95
packages: 12