Papers on hgpu.org (.txt-file)
Parallelization Strategies of the Canny Edge Detector for Multi-core CPUs and Many-core GPUs

Parallelization techniques of the x264 video encoder

Parallelization the Job-shop Problem on Distributed and Shared Memory Architectures

Parallelization with Different API on Multicore Architecture

Parallelization, Scalability, and Reproducibility in Next-Generation Sequencing Analysis

Parallelize L-BFGS-B on the GPU

Parallelized agent-based simulation on CPU and graphics hardware for spatial and stochastic models in biology

Parallelized generation of photon texture and real-time rendering on GPU
Parallelized Hierarchical Expected Matching Probability for Multiple Sequence Alignment

Parallelized Incomplete Poisson Preconditioner in Cloth Simulation

Parallelized Kendall’s Tau Coefficient Computation via SIMD Vectorized Sorting On Many-Integrated-Core Processors

Parallelized Local Volatility Estimation Using GP-GPU Hardware Acceleration
Parallelized Physical Optics computations for Scattering Center Models in radio channel simulations
Parallelized Seeded Region Growing using CUDA

Parallelized Segmentation of CT-Angiography datasets using CUDA

Parallelized Vlasov-Fokker-Planck solver for desktop personal computers

Parallelizing a high-order WENO scheme for complicated flow structures on GPU and MIC

Parallelizing AES on multicores and GPUs

Parallelizing Alternating Direction Implicit Solver on GPUs

Parallelizing compiler framework and API for power reduction and software productivity of real-time heterogeneous multicores

Parallelizing Exact and Approximate String Matching via Inclusive Scan on a GPU

Parallelizing flow-accumulation calculations on graphics processing units – From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

Parallelizing FPGA Technology Mapping Using Graphics Processing Units (GPUs)
Parallelizing fuzzy rule generation using GPGPU

Parallelizing General Histogram Application for CUDA Architectures

Parallelizing Kernel Polynomial Method Applying Graphics Processing Units

Parallelizing LINQ Program for GPGPU
Parallelizing Map Projection of Raster Data on Multi-core CPU and GPU Parallel Programming Frameworks

Parallelizing Motion JPEG 2000 with CUDA
Parallelizing Multicore Cache Simulations using Heterogeneous Computing on General Purpose and Graphics Processors

Parallelizing Multiple Flow Accumulation Algorithm using CUDA and OpenACC

Parallelizing of digital signal processing with using GPU

Parallelizing Peptide-Spectrum scoring using modern graphics processing units
Parallelizing Simulated Annealing-Based Placement Using GPGPU

Parallelizing the cellular potts model on GPU and multi-core CPU: An OpenCL cross-platform study

Parallelizing the Cellular Potts Model on graphics processing units

Parallelizing the Edge application for GPU-based systems using the SkePU skeleton programming library

Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics

Parallelizing Word2Vec in Multi-Core and Many-Core Architectures

Parallelizing Word2Vec in Shared and Distributed Memory

ParallelKittens: Systematic and Practical Simplification of Multi-GPU AI Kernels

Parameter Selection and Pre-Conditioning for a Graph Form Solver

Parameter Tuning of a Hybrid Treecode-FMM on GPUs

Parameterized Verification of GPU Kernel Programs

Parametric Flows: Automated Behavior Equivalencing for Symbolic Analysis of Races in CUDA Programs

Parametric GPU Code Generation for Affine Loop Programs

Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing

ParEval-Repo: A Benchmark Suite for Evaluating LLMs with Repository-level HPC Translation Tasks

PARIS: A Parallel RSA-Prime Inspection Tool

Parle: parallelizing stochastic gradient descent

ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data

PARRAY: A Unifying Array Representation for Heterogeneous Parallelism

Parsing in Parallel on Multiple Cores and GPUs

Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network

ParTeCL: parallel testing using OpenCL

Partial Demosaicing for Stereo Matching of CFA Images on GPU and CPU

Partial Parallelization of the Successive Projections Algorithm using Compute Unified Device Architecture

Partial Volume Effect Correction using Anisotropic Backward Diffusion

Partial wave analysis at BES III harnessing the power of GPUs

Partial Wave Analysis using Graphics Cards

PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs

Particle and texture based spatiotemporal visualization of time-dependent vector fields

Particle filter on GPUs for real-time tracking

Particle filtering with rendered models: A two pass approach to multi-object 3D tracking with the GPU

Particle Filters on Multi-Core Processors

Particle Level Set Advection for the Interactive Visualization of Unsteady 3D Flow

Particle Simulation on a GPU with PyCUDA

Particle Swarm Optimization of Model Parameters: Simulation of Deep Reactive Ion Etching by the Continuous Cellular Automaton

Particle-Based Fluid Simulation on the GPU

Particle-Based Multiple Irregular Volume Rendering on CUDA
Particle-based Visualization of Large Cosmological Datasets

Particle-based volume rendering
Particle-in-cell algorithms for plasma simulations on heterogeneous architectures

Particle-in-Cell Laser-Plasma Simulation on Xeon Phi Coprocessors

Particle-in-cell Simulations with Charge-Conserving Current Deposition on Graphic Processing Units
Partitioned Memory Parallel Programming Framework

Partitioning Large Scale Deep Belief Networks Using Dropout

Partitioning streaming parallelism for multi-cores: a machine learning based approach
Pass a Pointer: Exploring Shared Virtual Memory Abstractions in OpenCL Tools for FPGAs

PASSATA – Object oriented numerical simulation software for adaptive optics

Passive-Active Geometric Calibration for View-Dependent Projections onto Arbitrary Surfaces

Password Cracking in the Cloud

Password recovery for encrypted ZIP archives using GPUs

Password Recovery for RAR Files Using CUDA
Password Recovery Using MPI and CUDA

Patch-Based Image Vectorization with Automatic Curvilinear Feature Alignment

Path Integral Approaches and Graphics Processing Unit Tools for Quantum Molecular Dynamics Simulations

Pathological Image Analysis Using the GPU: Stroma Classification for Neuroblastoma
Pathological image segmentation for neuroblastoma using the GPU

Pattern Matching in OpenCL: GPU vs CPU Energy Consumption on Two Mobile Chipsets

Pattern Recognition with Embedded Systems Technology: A Survey
Pattern Recognition with OpenCL Heterogeneous Platform

Pattern-based Programming Abstractions for Heterogeneous Parallel Computing

Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code)

Patterns of Inefficient Performance Behavior in GPU Applications

PATUS: A Code Generation and Auto-Tuning Framework For Parallel Stencil Computations

PATUS: A Code Generation and Autotuning Framework For Parallel Iterative Stencil Computations on Modern Microarchitectures

Titles: 100
open PDFs: 84
packages: 20
