Papers on hgpu.org (.txt-file)
MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems

MSTg: Cryptographically strong pseudorandom number generator and its realization

MT4G: A Tool for Reliable Auto-Discovery of NVIDIA and AMD GPU Compute and Memory Topologies

mu-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching

mu-grind: A Framework for Dynamically Instrumenting HLS-Generated RTL

Multi Agent Navigation on the GPU

Multi GPU Implementation of Iterative Tomographic Reconstruction Algorithms

Multi GPU Implementation of the Simplex Algorithm

Multi GPU Performance of Conjugate Gradient Algorithm with Staggered Fermions

Multi GPU Performance of Conjugate Gradient Solver with Staggered Fermions in Mixed Precision

Multi scale block histogram of template feature for pedestrian detection

Multi- and many-core data mining with adaptive sparse grids

Multi-Agent Systems and General-Purpose Computing on Graphics Processing Units: A Survey

Multi-agent traffic simulation with CUDA

Multi-camera real-time depth estimation with discontinuity handling on PC graphics hardware

Multi-Centroid PSO Classification Learning on the GPU

Multi-core CPU or GPU-accelerated Multiscale Modeling for Biomolecular Complexes

Multi-core CUDA Architecture for Parallelization of Hierarchical Text Clustering

Multi-core parallelism in a column-store

Multi-Core Programming Design Patterns: Stream Processing Algorithms for Dynamic Scene Perceptions

Multi-core programming with OpenCL: performance and portability: OpenCL in a memory bound scenario

Multi-dimensional characterization of electrostatic surface potential computation on graphics processors

Multi-dimensional characterization of temporal data mining on graphics processors

Multi-dimensional Functional Principal Component Analysis

Multi-Directional Optimisation on the GPU

Multi-domain, Higher Order Level Set Scheme for 3D Image Segmentation on the GPU

Multi-Elimination ILU Preconditioners on GPUs

Multi-fragment effects on the GPU using the k-buffer

Multi-GPGPU Cellular Automata Simulations using OpenACC

Multi-GPU accelerated multi-spin Monte Carlo simulations of the 2D Ising model

Multi-GPU Accelerated Parallel Algorithm of Wallis Transformation for Image Enhancement

Multi-GPU Acceleration of Black-Scholes Equation based Option Pricing

Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations

Multi-GPU Based Lattice Boltzmann Method for Hemodynamic Simulation in Patient-Specific Cerebral Aneurysm

Multi-GPU based on multicriteria optimization for motion estimation system

Multi-GPU cluster wave propagation and OpenGL visualization

Multi-GPU Computing for Achieving Speedup in Real-time Aggregate Risk Analysis

Multi-GPU Distributed Parallel Bayesian Differential Topic Modelling

Multi-GPU Implementation for Iterative MR Image Reconstruction with Field Correction

Multi-GPU Implementation of a Hybrid Thermal Lattice Boltzmann Solver using the TheLMA Framework

Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Multi-GPU Implementation of Machine Learning Algorithm using CUDA and OpenCL

Multi-GPU Implementation of the Minimum Volume Simplex Analysis Algorithm for Hyperspectral Unmixing

Multi-GPU implementation of the NICAM atmospheric model

Multi-GPU Implementation of the Uniformization Method for Solving Markov Models

Multi-GPU Island-Based Genetic Algorithm

Multi-GPU Island-Based Genetic Algorithm for Solving the Knapsack Problem

Multi-GPU Load Balancing for In-Situ Simulation and Visualization

Multi-GPU Load Balancing for In-situ Visualization

Multi-GPU numerical simulation of electromagnetic waves

Multi-GPU Parallel Computing and Task Scheduling under Virtualization

Multi-GPU parallel memetic algorithm for capacitated vehicle routing problem

Multi-GPU parallelization of a 3D Bayesian CT algorithm and its application on real foam reconstruction with incomplete data set

Multi-GPU Performance of Incompressible Flow Computation by Lattice Boltzmann Method on GPU Cluster

Multi-GPU Performance Optimization of a CFD Code using OpenACC on Different Platforms

Multi-GPU performance optimization of a computational fluid dynamics code using OpenACC

Multi-GPU Rendering with Vulkan API

Multi-GPU Support on Shared Memory System using Directive-based Programming Model

Multi-GPU Support on Single Node Using Directive-Based Programming Model

Multi-GPU Support on the Marrow Algorithmic Skeleton Framework

Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI

Multi-GPU volume rendering using MapReduce

Multi-GPU-based Swendsen-Wang multi-cluster algorithm for the simulation of two-dimensional q-state Potts model

Multi-grain Parallel Processing of Data-Clustering on Programmable Graphics Hardware

Multi-hetero Acceleration by GPU and FPGA for Astrophysics Simulation on oneAPI Environment

Multi-Kepler GPU vs. Multi-Intel MIC for spin systems simulations

Multi-kernel Data Partitioning with Channel on OpenCL-based FPGAs

Multi-layer depth peeling via fragment sort

Multi-level Debugging for Multi-stage, Parallelizing Compilers

Multi-Level Graph Layout on the GPU

Multi-level Parallelism for Incompressible Flow Computations on GPU Clusters

Multi-level Parallelism for Time- and Cost-efficient Parallel Discrete Event Simulation on GPUs

Multi-level Parallelism with MPI and OpenACC for CFD Applications

Multi-level parallelization for hybrid ACO

Multi-level Parallelization of Advanced Video Coding on Hybrid CPU/GPU Platform

Multi-line AI-assisted Code Authoring

Multi-Lingual Speech Recognition with Low-Rank Multi-Task Deep Neural Networks

Multi-mass solvers for lattice QCD on GPUs

Multi-Moment Methods for PDEs and GPUs for Large-Scale Scientific Computations

Multi-Object Geodesic Active Contours (MOGAC): A Parallel Sparse-Field Algorithm for Image Segmentation

Multi-Pass and Frame Parallel Algorithms of Motion Estimation in H.264/AVC for Generic GPU

Multi-Platform LU-Decomposition Solution in OpenCL

Multi-scale modeling of nano scale phenomenon using CUDA based HPC setup

Multi-scale neural texture classification using the GPU as a stream processing engine

Multi-scale problems, high performance computing and hybrid numerical methods

Multi-Scale Scheduling Techniques for Signal Processing Systems

Multi-Scale, Multi-Level, Heterogeneous Features Extraction and Classification of Volumetric Medical Images

Multi-Science Applications with Single Codebase – GAMER – for Massively Parallel Architectures

Multi-swarm PSO algorithm for the Quadratic Assignment Problem: a massive parallel implementation on the OpenCL platform

Multi-target DPA attacks: Pushing DPA beyond the limits of a desktop computer

Multi-target vectorization with MTPS C++ generic library

Multi-Tasking Scheduling for Heterogeneous Systems

Multi-Tenant Virtual GPUs for Optimising Performance of a Financial Risk Application

Multi-thread implementations of the lattice Boltzmann method on non-uniform grids for CPUs and GPUs

Multi-Threaded Automatic Integration Using OpenMP and CUDA

Multi-threaded Geant4 on the Xeon-Phi with Complex High-Energy Physics Geometry

Titles: 100
open PDFs: 93
packages: 8
