
Papers on hgpu.org (.txt-file)

Origami: A Convolutional Network Accelerator Download

Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications Download Package

Orthogonalization on a General Purpose Graphics Processing Unit with Double Double and Quad Double Arithmetic Download

Orthogonalization on a general purpose graphics processing unit with double double and quad double arithmetic Download

Orthorectification by Using GPGPU Method Download

Out of kernel tuning and optimizations for portable large-scale docking experiments on GPUs Download

Out-of-core cone beam reconstruction using multiple GPUs Download

Out-of-core Implementation for Accelerator Kernels on Heterogeneous Clouds Download Package

Out-of-core singular value decomposition Download

Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling Download Package

Out-of-the-box library support for DBMS operations on GPUs Download Package

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer Download

Over-synchronization in GPU Programs Download Package

Overcoming the GPU memory limitation on FDTD through the use of overlapping subgrids

Overcomplete Dictionary Learning with Jacobi Atom Updates Download

Overdetermined Shooting Methods for Computing Standing Water Waves with Spectral Accuracy Download

Overhauling SC atomics in C11 and OpenCL Download

Overlap fermions on GPUs Download

Overlapping Computation and Communication for Advection on Hybrid Parallel Computers Download

Overlapping computation and communication of three-dimensional FDTD on a GPU cluster Download

Overtaking CPU DBMSes with a GPU in Whole-Query Analytic Processing with Parallelism-Friendly Execution Plan Optimization Download

Overview of approaches for accelerating scale invariant feature detection algorithm

Overview of implementation of DARPA GPU program in SAIC

OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU Performance Download

Owl: Differential-based Side-Channel Leakage Detection for CUDA Applications Download Package

P-HGRMS: A Parallel Hypergraph Based Root Mean Square Algorithm for Image Denoising Download

P4OMP: Retrieval-Augmented Prompting for OpenMP Parallelism in Serial Code Download

PacketShader: a GPU-accelerated software router Download

Padding Free Bank Conflict Resolution for CUDA-Based Matrix Transpose Algorithm Download

Pairwise Sequence Alignment for Very Long Sequences on GPUs Download

Pairwise Sequence Alignment with Gaps with GPU Download

PAKCK: Performance and Power Analysis of Key Computational Kernels on CPUs and GPUs Download

Panda: A Compiler Framework for Concurrent CPU-GPU Execution of 3D Stencil Computations on GPU-accelerated Supercomputers Download

PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures Download

Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor Download

Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU Download

PanJoin: A Partition-based Adaptive Stream Join Download

PANNA: Properties from Artificial Neural Network Architectures Download Package

Pannotia: Understanding Irregular GPGPU Graph Applications Download

PantaRay: fast ray-traced occlusion caching of massive scenes

PAPER – Accelerating parallel evaluations of ROCS Download Package

ParaCodex: A Profiling-Guided Autonomous Coding Agent for Reliable Parallel Code Generation and Translation Download Package

ParadisEO-MO-GPU: a Framework for Parallel GPU-based Local Search Metaheuristics Download Package

Paragon: Collaborative Speculative Loop Execution on GPU and CPU Download

ParaGraph: Weighted Graph Representation for Performance Optimization of HPC Kernels Download

Paraiso: An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations Download Package

Parakeet: A Just-In-Time Parallel Accelerator for Python Download

Parallax: Automatic Data-Parallel Training of Deep Neural Networks Download Package

Parallelizing AwSpPCA for robust facial recognition using CUDA Download

Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs Download

Parallel 3D Finite Difference Time Domain Simulations on Graphics Processors with Cuda

Parallel 3D Image Segmentation of Large Data Sets on a GPU Cluster Download

Parallel 3D multigrid methods on the STI cell BE architecture Download

Parallel 5 point SOR for solving the Convection Diffusion equation using graphics processing units Download

Parallel acceleration of CPU and GPU range queries over large data sets Download

Parallel Acceleration on Manycore Systems and Its Performance Analysis: OpenCL Case Study Download

Parallel accelerators for GlimmerHMM bioinformatics algorithm

Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations Download

Parallel AES algorithm for fast Data Encryption on GPU Download

Parallel AES Encryption Engines for Many-Core Processor Arrays Download

Parallel Agent systems on a GPU for use with Simulations and Games Download

Parallel Algorithm Design and Implementation of Regular/Irregular Problems: An In-depth Performance Study on Graphics Processing Units Download

Parallel Algorithm for BSDEs Based High Dimensional American Option Pricing on the GPU Download

Parallel Algorithm for Generation of Test Recommended Path using CUDA Download

Parallel Algorithm for GPU Processing; for use in High Speed Machine Vision Sensing of Cotton Lint Trash Download

Parallel Algorithm for Solving Kepler’s Equation on Graphics Processing Units: Application to Analysis of Doppler Exoplanet Searches Download Package

Parallel Algorithm of IDCT with GPUs and CUDA for Large-scale Video Quality of 3G Download

Parallel algorithms for approximation of distance maps on parametric surfaces Download

Parallel Algorithms for Constructing Data Structures for Fast Multipole Methods Download

Parallel Algorithms for Counting Problems on Graphs Using Graphics Processing Units Download

Parallel Algorithms for GPU accelerated Probabilistic Inference Download Package

Parallel Algorithms for Hybrid Multi-core CPU-GPU Implementations of Component Labelling in Critical Phase Models Download

Parallel algorithms for problems of cluster analysis with very large amount of data Download

Parallel Algorithms for the Summed Area Table on the Asynchronous Hierarchical Memory Machine, with GPU implementations Download

Parallel algorithms to a parallel hardware: Designing vision algorithms for a GPU Download

Parallel and Concurrent Programming in Haskell: Techniques for Multicore and Multithreaded Programming Download

Parallel and Distributed Deep Learning Download

Parallel and Distributed Implementations of Multiple and Two-Dimensional Pattern Matching Algorithms Download

Parallel and distributed seismic wave field modeling with combined Linux clusters and graphics processing units

Parallel and efficient Boolean on polygonal solids Download

Parallel and Heterogeneous Timing Analysis: Partition, Algorithm, and System Download Package

Parallel and Improved PageRank Algorithm for GPU-CPU Collaborative Environment Download

Parallel and in-process compilation of individuals for genetic programming on GPU Download Package

Parallel and Scalable Sparse Basic Linear Algebra Subprograms Download

Parallel ant colony for nonlinear function optimization with graphics hardware acceleration

Parallel Application Library for Object Recognition Download

Parallel Approach for Longest Common Subsequence problem on GPU Download

Parallel Approach for Time Series Analysis with General Regression Neural Networks Download

Parallel Approaches for SWAMP Sequence Alignment

Parallel Approaches to Edit Distance and Approximate String Matching Download

Parallel Approaches to Shortest-Path Problems for Multilevel Heterogeneous Computing Download

Parallel Arbitrary-precision Integer Arithmetic Download Package

Parallel Asynchronous Modelization and Execution of Cholesky Algorithm using Petri Nets Download

Parallel Banding Algorithm to compute exact distance transform with the GPU Download Package

Parallel Batch Training of the Self-Organizing Map Using OpenCL

Parallel Benefit on Different Programming Paradigms Download

Parallel Bio-Inspired Methods for Model Optimization and Pattern Recognition Download

Parallel birth and death process for cell nuclei extraction in histopathology images Download

Parallel Branch and Bound on a CPU-GPU System Download

Parallel Branch Prediction on GPU Platform Download

 

Brief statistics for this page

Titles: 100

Download open PDFs: 90

Packages: 18

* * *


HGPU group © 2010-2026 hgpu.org

All rights belong to the respective authors
