Papers on hgpu.org (.txt-file)
Meta-Programming and Auto-Tuning in the Search for High Performance GPU Code
Meta-programming and Multi-stage Programming for GPGPUs
Meta-simulation of large WSN on multi-core computers
MetaBinG: Using GPUs to Accelerate Metagenomic Sequence Classification
MetaCL – A Model-Based Approach to Programming Heterogeneous Architectures Using OpenCL
MetaFork: A Compilation Framework for Concurrency Models Targeting Hardware Accelerators and Its Application to the Generation of Parametric CUDA Kernels
MetaMorph: A Library Framework for Interoperable Kernels on Multi- and Many-core Clusters
Metamorphic Testing for (Graphics) Compilers
Method for simulation of coastal terrain on GPU
Methodology of control and supervision of web connected mobile robots with CUDA technology application
Methods and Metrics for Fair Server Assessment under Real-Time Financial Workloads
Methods for Accelerating Machine Learning in High Performance Computing
Methods for GPU Acceleration of Big Data Applications
Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures
MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring
MGPUSim: Enabling Multi-GPU Performance Modeling and Optimization
MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures
MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)
Microarchitectural Performance Characterization of Irregular GPU Kernels
Microbenchmarks for GPU characteristics: the occupancy roofline and the pipeline model
Microbranching in mode-I fracture using large scale simulations of amorphous and perturbed lattice models
Microlensing Observations Rapid Search for Exoplanets: MORSE code for GPUs
Micropolygon ray tracing with defocus and motion blur
MIDeA: a multi-parallel intrusion detection architecture
Migrating CUDA to oneAPI: A Smith-Waterman Case Study
Migrating from OpenGL ES to Vulkan
Migrating real-time depth image-based rendering from traditional to next-gen GPGPU
MILC Code Performance on High End CPU and GPU Supercomputer Clusters
MILC staggered conjugate gradient performance on Intel KNL
MILJS: Brand New JavaScript Libraries for Matrix Calculation and Machine Learning
MiMatrix: A Massively Distributed Deep Learning Framework on a Petascale High-density Heterogeneous Cluster
Mimetic Methods for Lagrangian Relaxation of Magnetic Fields
MIML Learning with CNNs: Yelp Restaurant Photo Classification
Mind the gap!: bridging the dichotomy of design and implementation
Minerals detection for hyperspectral images using adapted linear unmixing: LinMin
Minerva: A Scalable and Highly Efficient Training Platform for Deep Learning
MinGPU: a minimum GPU library for computer vision
miniLB: A Performance Portability Study of Lattice-Boltzmann Simulations
Minimal models for finite particles in fluctuating hydrodynamics
minimap2-fpga: Integrating hardware-accelerated chaining for efficient end-to-end long-read sequence mapping
Minimising Testing in Genetic Programming
Mining Rare Features in Fingerprints Using Core Points and Triplet-based Features
Mint: realizing CUDA performance in 3D stencil methods with annotated C
Minuet: Accelerating 3D Sparse Convolutions on GPUs
MIOpen: An Open Source Library For Deep Learning Primitives
Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU
Mirovia: A Benchmarking Suite for Modern Heterogeneous Computing
MITHRA: Multiple data independent tasks on a heterogeneous resource architecture
Mix-and-Match: A Model-driven Runtime Optimisation Strategy for BFS on GPUs
Mixed precision in Graphics Processing Unit
Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems
Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System
Mixed-Precision Embedding Using a Cache
Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration
Mixed-Precision GPU-Multigrid Solvers with Strong Smoothers
Mixed-precision Orthogonalization Scheme and Adaptive Step Size for CA-GMRES on GPUs
Mixed-precision orthogonalization scheme and its case studies with CA-GMRES on a GPU
Mixed-Resolution Patch-Matching
Mixed-Tool Performance Analysis on Hybrid Multicore Architectures
Mixing Low-Precision Formats in Multiply-Accumulate Units for DNN Training
Mixing Multi-Core CPUs and GPUs for Scientific Simulation Software
MKPipe: A Compiler Framework for Optimizing Multi-Kernel Workloads in OpenCL for FPGA
MLitB: Machine Learning in the Browser
MLS-based scalar fields over triangle meshes and their application in mesh processing
MNN: A Universal and Efficient Inference Engine
Mobile GPGPU Acceleration of Embodied Robot Simulation
Mobile GPU Computing Based Filter Bank Convolution for Three-dimensional Wavelet Transform
MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU
MobiRT: an implementation of OpenGL ES-based CPU-GPU hybrid ray tracer for mobile devices
Model Coupling between the Weather Research and Forecasting Model and the DPRI Large Eddy Simulator for Urban Flows on GPU-accelerated Multicore Systems
Model-Based 3D Object Tracking Using an Extended-Extended Kalman Filter and Graphics Rendered Measurements
Model-based optimization of MPDATA on Intel Xeon Phi through load imbalancing
Model-Based Warp-Level Tiling for Image Processing Programs on GPUs
Model-driven autotuning of sparse matrix-vector multiply on GPUs
Model-driven optimisation of memory hierarchy and multithreading on GPUs
Model-Driven Tile Size Selection for DOACROSS Loops on GPUs
Model-independent partial wave analysis using a massively-parallel fitting framework
Model-T: Rethinking the OS for terabit speeds
Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
Modeling and generating complex motion blur for real-time tracking
Modeling and Optimization of Parallel Matrix-based Computations on GPU
Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
Modeling Deep Learning Accelerator Enabled GPUs
Modeling GPU Dynamic Parallelism for Self Similar Density Workloads
Modeling GPU-CPU Workloads and Systems
Modeling Image Patches with a Generic Dictionary of Mini-Epitomes
Modeling of Heat Diffusion Through Isotropic Media Using Graphical Processing Units
Modeling of Heterogeneous Architecture with GPU to Exascale System
Modeling of High Performance Programs to Support Heterogeneous Computing
Modeling of the behavior of ²²²Rn progeny in diffusion chamber using CUDA
Modeling of tsunami waves and atmospheric swirling flows with graphics processing unit
Modeling Parallel Programs for Heterogeneous Computing
Modeling Parallel Programs using Large Language Models
Modeling Rotor Wakes with a Hybrid OVERFLOW-Vortex Method on a GPU Cluster
Modeling system for GPU parallel tasks performance simulation
Titles: 100
open PDFs: 96
packages: 25