Papers on hgpu.org (.txt-file)
Optimising Purely Functional GPU Programs
Optimising Purely Functional GPU Programs (Thesis)
Optimising Reconfigurable Systems for Real-time Applications
Optimising the DBCSR GPU Implementation
Optimising Unstructured Mesh Computational Fluid Dynamics Applications on Multicores via Machine Learning and Code Transformation
Optimistic Parallelism on GPUs
Optimization and Evaluation of VLPL-S Particle-in-cell Code on Knights Landing
Optimization and Implementation of LBM Benchmark on Multithreaded GPU
Optimization and Large Scale Computation of an Entropy-Based Moment Closure
Optimization and Parallelization Methods for the Design of Next-Generation Radio Networks
Optimization and parallelization of B-spline based orbital evaluations in QMC on multi/many-core shared memory processors
Optimization and parameter exploration using GPU based FDTD solvers
Optimization and Portability of a Fusion OpenACC-based FORTRAN HPC Code from NVIDIA to AMD GPUs
Optimization of a discontinuous finite element solver with OpenCL and StarPU
Optimization of a discontinuous Galerkin solver with OpenCL and StarPU
Optimization of a FDTD code for graphical processing units
Optimization of a finite element code implemented in MATLAB: On the use of GPUs for High Performance Computing
Optimization of a GPU Implementation of Multi-Dimensional RF Pulse Design Algorithm
Optimization of a Machine Learning Algorithm on the Heterogeneous system using OpenCL
Optimization of Compiler-generated OpenCL CNN Kernels and Runtime for FPGAs
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
Optimization of Data-Parallel Scientific Applications on Highly Heterogeneous Modern HPC Platforms
Optimization of GPU workloads using natural language processing based on deep learning techniques
Optimization of HEP codes on GPUs
Optimization of Heterogeneous Parallel Computing Systems using Machine Learning
Optimization of Heterogeneous Systems with AI Planning Heuristics and Machine Learning: A Performance and Energy Aware Approach
Optimization of Hierarchical Matrix Computation on GPU
Optimization of Large-Scale Sparse Matrix-Vector Multiplication on Multi-GPU Systems
Optimization of Lattice Boltzmann Simulations on Heterogeneous Computers
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
Optimization of mapped functions sequences using fusions on GPU
Optimization of massive data applications on heterogeneous architectures
Optimization of Molecular Dynamics Simulation Code and Applications to Biomolecular Systems
Optimization of OpenCL applications on FPGA
Optimization of parallel Genetic Algorithms for nVidia GPUs
Optimization of Pattern Matching Algorithms for Multi- and Many-Core Platforms
Optimization of Ported CFD Kernels on Intel Data Center GPU Max 1550 using oneAPI ESIMD
Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors
Optimization of RAID Erasure Coding Algorithms for Intel Xeon Phi
Optimization of real-time ultrasound PCIe data streaming and OpenCL processing for SAFT imaging
Optimization of solver for gas flow modeling
Optimization of Spatial Convolution in ConvNets on Intel KNL
Optimization of tele-immersion codes
Optimization of the Brillouin operator on the KNL architecture
Optimization of the Gaussian Mixture Model Evaluation on GPU
Optimization of the HEFT algorithm for a CPU-GPU environment
Optimization of the Oktay-Kronfeld Action Conjugate Gradient Inverter
Optimization of the Particle-based Volume Rendering for GPUs with Hiding Data Transfer Latency
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
Optimization procedures during parallelization of specialized software for fluid flow simulations
Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units
Optimization solutions for the segmented sum algorithmic function
Optimization strategies for parallel CPU and GPU implementations of a meshfree particle method
Optimization Techniques for CUDA Application
Optimization Techniques for GPU Programming
Optimization Techniques for Mapping Algorithms and Applications onto CUDA GPU Platforms and CPU-GPU Heterogeneous Platforms
Optimization Techniques on GPU: A Survey
Optimization, Specification and Verification of the Prefix Sum Program in an OpenCL Environment
Optimizations and Performance of a Robotics Grasping Algorithm Described in Geometric Algebra
Optimizations in Bioinformatics using GPU Processing on Binary Data
Optimize or Wait? Using llc Fast-Prototyping Tool to Evaluate CUDA Optimizations
Optimize Overall System Performance Through Workload Sequencing for GPUs Data Offloading
Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?
Optimized Code Generation for Parallel and Polyhedral Loop Nests using MLIR
Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers
Optimized Data Transfers Based on the OpenCL Event Management Mechanism
Optimized Deep Learning Architectures with Fast Matrix Operation Kernels on Parallel Platform
Optimized Event-Driven Runtime Systems for Programmability and Performance
Optimized GPU Framework for Pulsed Wave Doppler Ultrasound
Optimized GPU Framework for Speckle Reduction Using Histogram Matching and Region Growing
Optimized GPU Framework for Ultrasound B-Mode Imaging
Optimized GPU Framework for Ultrasound Color Flow Imaging
Optimized GPU Framework for Ultrasound Strain Imaging
Optimized GPU histograms for multi-modal registration
Optimized GPU Implementation and Performance Analysis of HC Series of Stream Ciphers
Optimized GPU simulation of continuous-spin glass models
Optimized HPL for AMD GPU and multi-core CPU usage
Optimized MFCC Feature Extraction on GPU
Optimized Parallel Implementation of Gillespie’s First Reaction Method on Graphics Processing Units
Optimized parallel implementation of pedestrian tracking using HOG features on GPU
Optimized Password Recovery for Encrypted RAR on GPUs
Optimized Pattern-Based Adaptive Mesh Refinement Using GPU
Optimized Private Information Retrieval Protocol Using Graphics Processing Unit With Reduced Accessibility
Optimized Strategies for Mapping Three-dimensional FFTs onto CUDA GPUs
Optimizing 3D Convolutions for Wavelet Transforms on CPUs with SSE Units and GPUs
Optimizing a Biomedical Imaging Orientation Score Framework
Optimizing a Hardware Network Stack to Realize an In-Network ML Inference Application
Optimizing a High Energy Physics (HEP) Toolkit on Heterogeneous Architectures
Optimizing a Near-duplicate Document Detection System with SIMD Technologies
Optimizing a Semantic Comparator using CUDA-enabled Graphics Hardware
Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform
Optimizing All-to-All and Allgather Communications on GPGPU Clusters
Optimizing an OpenCL Application for Video Watermarking in FPGAs
Optimizing and Auto-tuning Belief Propagation on the GPU
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures
Optimizing ASP.NET with C++ AMP on the GPU
Optimizing Block-Sparse Matrix Multiplications on CUDA with TVM
Optimizing Communication by Compression for Multi-GPU Scalable Breadth-First Searches
Optimizing Communication for Clusters of GPUs
Optimizing CUDA Code By Kernel Fusion – Application on BLAS
Titles: 100
Open PDFs: 83
Packages: 13