Papers on hgpu.org
MacroSS: macro-SIMDization of streaming applications
Maestro: Data Orchestration and Tuning for OpenCL Devices
MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs
MAGMA Embedded: Towards a Dense Linear Algebra Library for Energy Efficient Extreme Computing
MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing
Magneto-hydrodynamics simulation in astrophysics
Magnetohydrodynamics on Heterogeneous architectures: a performance comparison
Magnetohydrodynamics simulations on graphics processing units
Maintaining constant frame rates in 3D texture-based volume rendering
Makespan computation for GPU threads running on a single streaming multiprocessor
Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis
Making the case of GPUs in courses on computational physics
MALBEC: a new CUDA-C ray-tracer in General Relativity
MambaCPU: Enhanced Correlation Mining with State Space Models for CPU Performance Prediction
Managing Extreme Heterogeneity in Next Generation HPC Systems
Managing heterogeneous device memory using C++17 memory resources
Managing the Topology of Heterogeneous Cluster Nodes with Hardware Locality (hwloc)
Managing, Profiling, and Optimizing Heterogeneous GPU Workloads
Manas: Mining Software Repositories to Assist AutoML
ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills
Many Cores, Many Models: GPU Programming Model vs. Vendor Compatibility Overview
Many-body quantum chemistry on graphics processing units
Many-Core Algorithms for Combinatorial Optimization
Many-core algorithms for statistical phylogenetics
Many-core applications to online track reconstruction in HEP experiments
Many-Core Architectures: Hardware-Software Optimization and Modeling Techniques
Many-core GPU computing with NVIDIA CUDA
Many-core parallel computing – Can compilers and tools do the heavy lifting?
Many-Core vs. Many-Thread Machines: Stay Away From the Valley
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
Many-threaded Differential Evolution on the GPU
Many-threaded implementation of differential evolution for the CUDA platform
Manycore high-performance computing in bioinformatics
Manycore processing of repeated k-NN queries over massive moving objects observations
Manycore processing of repeated range queries over massive moving objects observations
MAP-based Brain Tissue Segmentation using Manifold Learning and Hierarchical Max-Flow regularization
Map-reduce as a Programming Model for Custom Computing Machines
MapCG: writing parallel program portable between CPU and GPU
MapGraph: A High Level API for Fast Development of High Performance Graph Analytics on GPUs
Mapping a Data-Flow Programming Model onto Heterogeneous Platforms
Mapping a Dataflow Programming Model onto Heterogeneous Architectures
Mapping a Guided Image Filter on the HARP Reconfigurable Architecture Using OpenCL
Mapping computational concepts to GPUs
Mapping dynamic programming algorithms on graphics processing units
Mapping High-Fidelity Volume Rendering for Medical Imaging to CPU, GPU and Many-Core Architectures
Mapping Iterative Medical Imaging Algorithm on Cell Accelerator
Mapping of a film grain removal algorithm to a heterogeneous reconfigurable architecture
Mapping parallel programs to heterogeneous multi-core systems
Mapping Streaming Applications to OpenCL
Mapping the Arnold web with a GPU-supercomputer
Mapping the Arnold web with a graphic processing unit
Mapping the SBR and TW-ILDCs to Heterogeneous CPU-GPU Architecture for Fast Computation of Electromagnetic Scattering
MapReduce for Counting Word Frequencies with MPI and GPUs
MapSQ: A MapReduce-based Framework for SPARQL Queries on GPU
MARC: A Many-Core Approach to Reconfigurable Computing
March of the Froblins: simulation and rendering massive crowds of intelligent and detailed creatures on GPU
Marian: Cost-effective High-Quality Neural Machine Translation in C++
Markerless View-Independent Registration of Multiple Distorted Projectors on Extruded Surfaces Using an Uncalibrated Camera
Markov Chain Monte Carlo on the GPU
Mars: a MapReduce framework on graphics processors
Mars: Accelerating MapReduce with Graphics Processors
Mascar: Speeding up GPU Warps by Reducing Memory Pitstops
MASCOT: Fast and Highly Scalable SVM Cross-validation using GPUs and SSDs
Mashing load balancing algorithm to boost hybrid kernels in molecular dynamics simulations
Masivo: Parallel Simulation Model Based on OpenCL for Massive Public Transportation Systems’ Routes
Mass Estimation from Images using Deep Neural Network and Sparse Ground Truth
Mass-spring systems on the GPU
Massive Exploration of Neural Machine Translation Architectures
Massive exploration of perturbed conditions of the blood coagulation cascade through GPU parallelization
Massive Image Editing on the Cloud
Massive Parallel Implementation of ODE Solvers
Massive parallel LDPC decoding on GPU
Massive Parallelism with GPUs for Centrality Ranking in Complex Networks
Massive parallelization of combinatorial statistical genetics analyses porting machine learning methods on general purpose graphics processing units (GPU)
Massive parallelization of serial inference algorithms for a complex generalized linear model
Massively Deep Artificial Neural Networks for Handwritten Digit Recognition
Massively LDPC Decoding on Multicore Architectures
Massively Parallel A* Search on a GPU
Massively Parallel Algorithms for CFD Simulation and Optimization on Heterogeneous Many-Core Architectures
Massively Parallel Analysis of Similarity Matrices on Heterogeneous Hardware
Massively parallel approximate Gaussian process regression
Massively Parallel Computation of Accurate Densities for N-body Dark Matter Simulations using the Phase-Space-Element Method
Massively parallel computation using graphics processors with application to optimal experimentation in dynamic control
Massively Parallel Computing in Economics
Massively Parallel Construction of the Cell Graph
Massively parallel differential evolution-pattern search optimization with graphics hardware acceleration: an investigation on bound constrained optimization problems
Massively Parallel Finite Element Simulator for Full-Chip STI Stress Analysis
Massively Parallel GPU Computing of Continuum Robotic Dynamics
Massively Parallel GPU Memory Compaction
Massively Parallel Identification of Intersection Points for GPGPU Ray Tracing
Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit
Massively Parallel Jacobian Computation
Massively Parallel kNN using CUDA on Spam-Classification
Massively Parallel Localization of Pulsed Signal Transitions Using a GPU
Massively Parallel Logic Simulation with GPUs
Massively Parallel Lossless Compression of Medical Images Using Least-Squares Prediction and Arithmetic Coding
Massively parallel Monte Carlo for many-particle simulations on GPUs
Titles: 100
open PDFs: 93
packages: 25