Papers on hgpu.org (.txt-file)
Optimising Unstructured Mesh Computational Fluid Dynamics Applications on Multicores via Machine Learning and Code Transformation

Optimistic Parallelism on GPUs

Optimization and Evaluation of VLPL-S Particle-in-cell Code on Knights Landing

Optimization and Implementation of LBM Benchmark on Multithreaded GPU
Optimization and Large Scale Computation of an Entropy-Based Moment Closure

Optimization and Parallelization Methods for the Design of Next-Generation Radio Networks

Optimization and parallelization of B-spline based orbital evaluations in QMC on multi/many-core shared memory processors

Optimization and parameter exploration using GPU based FDTD solvers
Optimization and Portability of a Fusion OpenACC-based FORTRAN HPC Code from NVIDIA to AMD GPUs

Optimization of a discontinuous finite element solver with OpenCL and StarPU

Optimization of a discontinuous Galerkin solver with OpenCL and StarPU

Optimization of a FDTD code for graphical processing units
Optimization of a finite element code implemented in MATLAB: On the use of GPUs for High Performance Computing

Optimization of a GPU Implementation of Multi-Dimensional RF Pulse Design Algorithm
Optimization of a Machine Learning Algorithm on the Heterogeneous system using OpenCL

Optimization of Compiler-generated OpenCL CNN Kernels and Runtime for FPGAs

Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming

Optimization of Data-Parallel Scientific Applications on Highly Heterogeneous Modern HPC Platforms

Optimization of GPU workloads using natural language processing based on deep learning techniques

Optimization of HEP codes on GPUs

Optimization of Heterogeneous Parallel Computing Systems using Machine Learning

Optimization of Heterogeneous Systems with AI Planning Heuristics and Machine Learning: A Performance and Energy Aware Approach

Optimization of Hierarchical Matrix Computation on GPU

Optimization of Large-Scale Sparse Matrix-Vector Multiplication on Multi-GPU Systems

Optimization of Lattice Boltzmann Simulations on Heterogeneous Computers

Optimization of linked list prefix computations on multithreaded GPUs using CUDA

Optimization of mapped functions sequences using fusions on GPU

Optimization of massive data applications on heterogeneous architectures

Optimization of Molecular Dynamics Simulation Code and Applications to Biomolecular Systems

Optimization of OpenCL applications on FPGA

Optimization of parallel Genetic Algorithms for nVidia GPUs
Optimization of Pattern Matching Algorithms for Multi- and Many-Core Platforms

Optimization of Ported CFD Kernels on Intel Data Center GPU Max 1550 using oneAPI ESIMD

Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors

Optimization of RAID Erasure Coding Algorithms for Intel Xeon Phi

Optimization of real-time ultrasound PCIe data streaming and OpenCL processing for SAFT imaging

Optimization of solver for gas flow modeling

Optimization of Spatial Convolution in ConvNets on Intel KNL

Optimization of tele-immersion codes

Optimization of the Brillouin operator on the KNL architecture

Optimization of the Gaussian Mixture Model Evaluation on GPU

Optimization of the HEFT algorithm for a CPU-GPU environment

Optimization of the Oktay-Kronfeld Action Conjugate Gradient Inverter

Optimization of the Particle-based Volume Rendering for GPUs with Hiding Data Transfer Latency

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA

Optimization procedures during parallelization of specialized software for fluid flow simulations

Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units

Optimization solutions for the segmented sum algorithmic function

Optimization strategies for parallel CPU and GPU implementations of a meshfree particle method

Optimization Techniques for CUDA Application

Optimization Techniques for GPU Programming

Optimization Techniques for Mapping Algorithms and Applications onto CUDA GPU Platforms and CPU-GPU Heterogeneous Platforms

Optimization Techniques on GPU: A Survey
Optimization, Specification and Verification of the Prefix Sum Program in an OpenCL Environment

Optimizations and Performance of a Robotics Grasping Algorithm Described in Geometric Algebra

Optimizations in Bioinformatics using GPU Processing on Binary Data
Optimize or Wait? Using llc Fast-Prototyping Tool to Evaluate CUDA Optimizations
Optimize Overall System Performance Through Workload Sequencing for GPUs Data Offloading

Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?

Optimized Code Generation for Parallel and Polyhedral Loop Nests using MLIR

Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers

Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Optimized Deep Learning Architectures with Fast Matrix Operation Kernels on Parallel Platform

Optimized Event-Driven Runtime Systems for Programmability and Performance

Optimized GPU Framework for Pulsed Wave Doppler Ultrasound
Optimized GPU Framework for Speckle Reduction Using Histogram Matching and Region Growing
Optimized GPU Framework for Ultrasound B-Mode Imaging
Optimized GPU Framework for Ultrasound Color Flow Imaging
Optimized GPU Framework for Ultrasound Strain Imaging
Optimized GPU histograms for multi-modal registration
Optimized GPU Implementation and Performance Analysis of HC Series of Stream Ciphers

Optimized GPU simulation of continuous-spin glass models

Optimized HPL for AMD GPU and multi-core CPU usage
Optimized MFCC Feature Extraction on GPU

Optimized Parallel Implementation of Gillespie’s First Reaction Method on Graphics Processing Units

Optimized parallel implementation of pedestrian tracking using HOG features on GPU
Optimized Password Recovery for Encrypted RAR on GPUs

Optimized Pattern-Based Adaptive Mesh Refinement Using GPU

Optimized Private Information Retrieval Protocol Using Graphics Processing Unit With Reduced Accessibility

Optimized Strategies for Mapping Three-dimensional FFTs onto CUDA GPUs

Optimizing 3D Convolutions for Wavelet Transforms on CPUs with SSE Units and GPUs

Optimizing a Biomedical Imaging Orientation Score Framework

Optimizing a Hardware Network Stack to Realize an In-Network ML Inference Application

Optimizing a High Energy Physics (HEP) Toolkit on Heterogeneous Architectures

Optimizing a Near-duplicate Document Detection System with SIMD Technologies

Optimizing a Semantic Comparator using CUDA-enabled Graphics Hardware

Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform
Optimizing All-to-All and Allgather Communications on GPGPU Clusters

Optimizing an OpenCL Application for Video Watermarking in FPGAs

Optimizing and Auto-tuning Belief Propagation on the GPU

Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures

Optimizing ASP.NET with C++ AMP on the GPU

Optimizing Block-Sparse Matrix Multiplications on CUDA with TVM

Optimizing Communication by Compression for Multi-GPU Scalable Breadth-First Searches

Optimizing Communication for Clusters of GPUs

Optimizing CUDA Code By Kernel Fusion – Application on BLAS

Optimizing CUDA Shared Memory Usage

Optimizing data intensive GPGPU computations for DNA sequence alignment

Optimizing Data Locality for Iterative Matrix Solvers on CUDA

Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission

Titles: 100
open PDFs: 83
packages: 11
