Papers on hgpu.org (.txt-file)

Scalable Streaming Tools for Analyzing N-body Simulations: Finding Halos and Investigating Excursion Sets in One Pass

Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth

Scalable Techniques for Scheduling and Mapping DSP Applications onto Embedded Multiprocessor Platforms

Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay

Scalable Verification Techniques for Data-Parallel Programs

Scalable, High Performance Fourier Domain Optical Coherence Tomography: Why FPGAs and Not GPGPUs

Scalar collapse in AdS with an OpenCL open source code

SCALE-Ahead-Of-Time Compilation of CUDA for AMD GPUs

Scale-dependent and example-based grayscale stippling
Scale-space ridge detection with GPU acceleration
Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs
ScaleHLS: Scalable High-Level Synthesis through MLIR

Scaling behavior of topologically constrained polymer rings in a melt

Scaling Coupled Climate Models to Exascale: OpenACC-enabled ECEarth3 Earth System Model

Scaling CUDA for Distributed Heterogeneous Processors

Scaling Deep Learning on GPU and Knights Landing clusters

Scaling Deep Learning on Multiple In-Memory Processors

Scaling Fast Multipole Methods up to 4000 GPUs

Scaling GPU-Accelerated Databases beyond GPU Memory Size

Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters

Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer

Scaling Hierarchical N-body Simulations on GPU Clusters

Scaling High Performance Domain-Specific Language Implementation with Delite

Scaling IDS construction based on Non-negative Matrix factorization using GPU computing

Scaling LAPACK panel operations using parallel cache assignment

Scaling Lattice QCD beyond 100 GPUs

Scaling Monte Carlo Tree Search on Intel Xeon Phi

Scaling Multifluid Compressible Fluid Dynamics to 700,000 cores, 1.5 Pflop/s, and a Trillion Grid Cells

Scaling On-Device GPU Inference for Large Generative Models

Scaling Performance of FFT Computation on an Industrial Integrated GPU Co-processor: Experiments with Algorithm Adaptation

Scaling Radio Astronomy Signal Correlation on Heterogeneous Supercomputers Using Various Data Distribution Methodologies

Scaling Recurrent Neural Network Language Models

Scaling Results for a Discontinuous Galerkin Finite-Element Wave Solver on Multi-GPU Systems

Scaling Soft Matter Physics to Thousands of GPUs in Parallel

Scaling SU(2) to 1000 GPUs using HiRep

Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures

Scaling-up spatially-explicit ecological models using graphics processors

SCALSALE: Scalable SALE Benchmark Framework for Supercomputers

Scan primitives for GPU computing

Scan Test Power Simulation on GPGPUs

Scandalously Parallelizable Mesh Generation

ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU

Scattering Parameters and Surface Normals from Homogeneous Translucent Materials using Photometric Stereo

Scattering Points in Parallel Coordinates

Scene Boundary Detection Technique Based on Bottom-Up Attention System and OpenCL Parallel Implementation

Scene image classifying via the Partially Connected Neural Network
Scene independent real-time indirect illumination

Scene Recognition Acceleration Using CUDA and OpenMP
SCF: a device- and language-independent task coordination framework for reconfigurable, heterogeneous systems

SCGPSim: A fast SystemC simulator on GPUs

Scheduling (ir)regular applications on heterogeneous platforms

Scheduling a Parallel Sparse Direct Solver to Multiple GPUs

Scheduling by Work-Stealing in Hybrid Parallel Architectures

Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs

Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures

Scheduling Dataflow Execution Across Multiple Accelerators

Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing

Scheduling for new computing platforms with GPUs

Scheduling Languages: A Past, Present, and Future Taxonomy

Scheduling of Linear Algebra Kernels on Multiple Heterogeneous Resources

Scheduling on Manycore and Heterogeneous Graphics Processors

Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling

Scheduling processing of real-time data streams on heterogeneous multi-GPU systems

Scheduling Tasks over Multicore machines enhanced with Accelerators: a Runtime System’s Perspective

SciAI4Industry – Solving PDEs for industry-scale problems with deep learning

SciDef: Automating Definition Extraction from Academic Literature with Large Language Models

Scientific and Engineering Computing Using ATI Stream Technology

Scientific computation for simulations on programmable graphics hardware

Scientific Computation on Graphics Processing Unit using CUDA

Scientific Computation Through a GPU
Scientific Computing on Heterogeneous Architectures

Scientific Computing on Hybrid Architectures

Scientific Computing Using Consumer Video-Gaming Hardware Devices

Scientific Computing with Python on GPUs

Scientific GPU Programming with Data-Flow Languages

Scientific Programming for Heterogeneous Systems – Bridging the Gap between Algorithms and Applications

Scientific Visualization in Astronomy: Towards the Petascale Astronomy Era

Scope for performance enhancement of CMU Sphinx by parallelising with OpenCL

Scope is all you need: Transforming LLMs for HPC Code

Scout: a data-parallel programming language for graphics processors
Seamless acceleration of Fortran intrinsics via AMD AI engines

Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio

Seamless GPU acceleration for C++ based physics with the Metal Shading Language on Apple’s M series unified chips

Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures

Searching for a counterexample of Kurepa’s Conjecture

Searching for Concurrent Design Patterns in Video Games

Searching for sinks of Henon map using a multiple-precision GPU arithmetic library

Second Order Pre-Integrated Volume Rendering

Secret Key Cryptography Using Graphics Cards

Secure 3D graphics for virtual machines

Secure Distributed Computing on a Manycore Cloud

SecureMed: Secure Medical Computation using GPU-Accelerated Homomorphic Encryption Scheme

Securing GPU via Region-based Bounds Checking

Seeded ND medical image segmentation by cellular automaton on GPU

SeedFold: Scaling Biomolecular Structure Prediction

Seeing through the fog: an algorithm for fast and accurate touch detection in optical tabletop surfaces

Seer: Predictive Runtime Kernel Selection for Irregular Problems

Seismic Attributes Extraction Based on GPU

Titles: 100
open PDFs: 92
packages: 21
