Papers on hgpu.org
OMB-Py: Python Micro-Benchmarks for Evaluating Performance of MPI Libraries on HPC Systems
OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures
Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs
OMP2HMPP: Compiler Framework for Energy-Performance Trade-off Analysis of Automatically Generated Codes
OMP2HMPP: HMPP Source Code Generation from Programs with Pragma Extensions
On a Simplified Approach to Achieve Parallel Performance and Portability Across CPU and GPU Architectures
On algorithmic reductions in task-parallel programming models
On Benchmarking the Matrix Multiplication Algorithm using OpenMP, MPI and CUDA Programming Languages
On Binaural Spatialization and the Use of GPGPU for Audio Processing
On continuous maximum flow image segmentation algorithm
On CUDA implementation of a multichannel room impulse response reshaping algorithm based on p-norm optimization
On Demand Solid Texture Synthesis Using Deep 3D Networks
On Development, Feasibility, and Limits of Highly Efficient CPU and GPU Programs in Several Fields
On Dynamic Load Balancing on Graphics Processors
On Efficient GPGPU Computing for Integrated Heterogeneous CPU-GPU Microprocessors
On Expressing Different Concurrency Paradigms on Virtual Execution Systems
On Expressing Different Concurrency Paradigms on Virtual Execution Systems (thesis)
On GPU Fourier Transformations
On GPU-Accelerated Fast Direct Solvers and Their Applications in Image Denoising
On GPU’s viability as a middleware accelerator
On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest
On learning optimized reaction diffusion processes for effective image restoration
On Leveraging GPUs for Security: discussing k-anonymity and pattern matching
On Longest Repeat Queries Using GPU
On Migration and Consolidation of VMs in Hybrid CPU-GPU Environments
On modelling of anisotropic viscoelasticity for soft tissue simulation: numerical solution and GPU execution
On optimization of finite-difference time-domain (FDTD) computation on heterogeneous and GPU clusters
On optimization techniques for the matrix multiplication on hybrid CPU+GPU platforms
On Optimizing Complex Stencils on GPUs
On Parallel Software Verification using Boolean Equation Systems
On Password Guessing with GPUs and FPGAs
On Performance of GPU and DSP Architectures for Computationally Intensive Applications
On Pre-Trained Image Features and Synthetic Images for Deep Learning
On Reinforcement Learning for Full-length Game of StarCraft
On Runtime Systems for Task-based Programming on Heterogeneous Platforms
On Scheduling Ring-All-Reduce Learning Jobs in Multi-Tenant GPU Clusters with Communication Contention
On Simplifying and Optimizing Programs for Heterogeneous Computing Systems
On sorting and load balancing on GPUs
On Static Timing Analysis of GPU Kernels
On testing GPU memory for hard and soft errors
On the Accelerating of Two-dimensional Smart Laplacian Smoothing on the GPU
On the accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit and novel 16-bit number formats
On the Characterization of OpenCL Dwarfs on Fixed and Reconfigurable Platforms
On the Choice of Tensor Estimation for Corner Detection, Optical Flow and Denoising
On the Compilation Performance of Current SYCL Implementations
On the Correctness of the SIMT Execution Model of GPUs
On the Cryptanalysis of Public-Key Cryptography
On the design of architecture-aware algorithms for emerging applications
On the design of sparse hybrid linear solvers for modern parallel architectures
On the Development and Implementation of High-Order Flux Reconstruction Schemes for Computational Fluid Dynamics
On the Effect of Using Multiple GPUs in Solving QAPs with CUDA
On the Effectiveness of OpenMP teams for Programming Embedded Manycore Accelerators
On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing
On the Efficacy of GPU-Integrated MPI for Scientific Applications
On the Efficiency of CPU and Hybrid CPU-GPU Systems in Computational Biology Tasks
On the efficiency of iterative ordered subset reconstruction algorithms for acceleration on GPUs
On the energy efficiency of graphics processing units for scientific computing
On the evaluation of matrix polynomials using several GPGPUs
On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach
On the GPGPU parallelization issues of finite element approximate inverse preconditioning
On the limits of GPU acceleration
On the numerical sensitivity of computer simulations on hybrid and parallel computing systems
On the numerical solution of chaotic dynamical systems using extend precision floating point arithmetic and very high order numerical methods
On the origin of yet another channel
On the Parallelization of Integer Polynomial Multiplication
On the Partitioning of GPU Power among Multi-Instances
On the Performance and Energy-efficiency of Multi-core SIMD CPUs and CUDA-enabled GPUs
On the performance of a highly-scalable Computational Fluid Dynamics code on AMD, ARM and Intel processors
On the performance of GPU public-key cryptography
On the Performance Portability of Structured Grid Codes on Many-Core Computer Architectures
On the Portability of CPU-Accelerated Applications via Automated Source-to-Source Translation
On the Portability of GPU-Accelerated Applications via Automated Source-to-Source Translation
On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms
On the Programmability and Performance of Heterogeneous Platforms
On the programmability of multi-GPU computing systems
On the Relation between Anisotropic Diffusion and Iterated Adaptive Filtering
On the Representation of Partially Specified Implementations and its Application to the Optimization of Linear Algebra Kernels on GPU
On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit
On the Simulations of Evolution-Communication P Systems with Energy without Antiport Rules for GPUs
On the technology roadmap of Free-Viewpoint 3DTV receivers
On the Three P’s of Parallel Programming for Heterogeneous Computing: Performance, Productivity, and Portability
On the type of the temperature phase transition in phi-4 model
On the Usage of GPUs for Efficient Motion Estimation in Medical Image Sequences
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization
On the Use of an Algebraic Language Interface for Waveform Definition
On the use of deep Boltzmann machines for road signs classification
On the Use of GPUs in Realizing Cost-Effective Distributed RAID
On the Use of Graphic Processing Units for the Efficient Implementation of MIMO Detectors
On the Use of Graphics Processing Units (GPUs) for Molecular Dynamics Simulation of Spherical Particles
On the Use of Remote GPUs and Low-Power Processors for the Acceleration of Scientific Applications
On the Use of Small 2D Convolutions on GPUs
On the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods
On the Validation and Applications of a Parallel Flexible Multi-Body Dynamics Implementation
On the Visualization of Social and other Scale-Free Networks
On the Way to Future’s High Energy Particle Physics Transport Code
On Using GPU to Compute Options and Derivatives
On Vectorization of Deep Convolutional Neural Networks for Vision Tasks
On-Demand Generating and Scheduling Optimised Parallel Applications on Heterogeneous Platforms
Titles: 100
open PDFs: 96
packages: 16