Papers on hgpu.org (.txt-file)

Mapping a Guided Image Filter on the HARP Reconfigurable Architecture Using OpenCL

Mapping computational concepts to GPUs

Mapping dynamic programming algorithms on graphics processing units

Mapping High-Fidelity Volume Rendering for Medical Imaging to CPU, GPU and Many-Core Architectures

Mapping Iterative Medical Imaging Algorithm on Cell Accelerator

Mapping of a film grain removal algorithm to a heterogeneous reconfigurable architecture

Mapping parallel programs to heterogeneous multi-core systems

Mapping Streaming Applications to OpenCL

Mapping the Arnold web with a GPU-supercomputer

Mapping the Arnold web with a graphic processing unit

Mapping the SBR and TW-ILDCs to Heterogeneous CPU-GPU Architecture for Fast Computation of Electromagnetic Scattering

MapReduce for Counting Word Frequencies with MPI and GPUs

MapSQ: A MapReduce-based Framework for SPARQL Queries on GPU

MARC: A Many-Core Approach to Reconfigurable Computing

March of the Froblins: simulation and rendering massive crowds of intelligent and detailed creatures on GPU

Marian: Cost-effective High-Quality Neural Machine Translation in C++

Markerless View-Independent Registration of Multiple Distorted Projectors on Extruded Surfaces Using an Uncalibrated Camera

Markov Chain Monte Carlo on the GPU

Mars: a MapReduce framework on graphics processors

Mars: Accelerating MapReduce with Graphics Processors

Mascar: Speeding up GPU Warps by Reducing Memory Pitstops

MASCOT: Fast and Highly Scalable SVM Cross-validation using GPUs and SSDs

Mashing load balancing algorithm to boost hybrid kernels in molecular dynamics simulations

Masivo: Parallel Simulation Model Based on OpenCL for Massive Public Transportation Systems’ Routes

Mass Estimation from Images using Deep Neural Network and Sparse Ground Truth

Mass-spring systems on the GPU

Massive Exploration of Neural Machine Translation Architectures

Massive exploration of perturbed conditions of the blood coagulation cascade through GPU parallelization

Massive Image Editing on the Cloud

Massive Parallel Implementation of ODE Solvers

Massive parallel LDPC decoding on GPU

Massive Parallelism with GPUs for Centrality Ranking in Complex Networks

Massive parallelization of combinatorial statistical genetics analyses porting machine learning methods on general purpose graphics processing units (GPU)

Massive parallelization of serial inference algorithms for a complex generalized linear model

Massively Deep Artificial Neural Networks for Handwritten Digit Recognition

Massively LDPC Decoding on Multicore Architectures

Massively Parallel A* Search on a GPU

Massively Parallel Algorithms for CFD Simulation and Optimization on Heterogeneous Many-Core Architectures

Massively Parallel Analysis of Similarity Matrices on Heterogeneous Hardware

Massively parallel approximate Gaussian process regression

Massively Parallel Computation of Accurate Densities for N-body Dark Matter Simulations using the Phase-Space-Element Method

Massively parallel computation using graphics processors with application to optimal experimentation in dynamic control

Massively Parallel Computing in Economics

Massively Parallel Construction of the Cell Graph

Massively parallel differential evolution-pattern search optimization with graphics hardware acceleration: an investigation on bound constrained optimization problems

Massively Parallel Finite Element Simulator for Full-Chip STI Stress Analysis

Massively Parallel GPU Computing of Continuum Robotic Dynamics

Massively Parallel GPU Memory Compaction

Massively Parallel Identification of Intersection Points for GPGPU Ray Tracing

Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit

Massively Parallel Jacobian Computation

Massively Parallel kNN using CUDA on Spam-Classification

Massively Parallel Localization of Pulsed Signal Transitions Using a GPU

Massively Parallel Logic Simulation with GPUs

Massively Parallel Lossless Compression of Medical Images Using Least-Squares Prediction and Arithmetic Coding

Massively parallel Monte Carlo for many-particle simulations on GPUs

Massively Parallel Network Coding on GPUs

Massively Parallel Neural Encoding and Decoding of Visual Stimuli

Massively Parallel Ray Tracing Algorithm Using GPU

Massively parallel read mapping on GPUs with PEANUT

Massively parallel read mapping on GPUs with the q-group index and PEANUT

Massively Parallel Sequential Monte Carlo for Bayesian Inference

Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA

Massively Parallel Suffix Array Queries and On-Demand Phrase Extraction for Statistical Machine Translation Using GPUs

Massively parallel two-dimensional TLM algorithm on graphics processing units

Massively parallelizable list-mode reconstruction using a Monte Carlo-based elliptical Gaussian model

Massively Parallelized Monte Carlo Simulation and its Applications in Finance

Massively parallelized replica-exchange simulations of polymers on GPUs

Massively-Parallel Lossless Data Decompression

Mastering Atari with Discrete World Models

Mastering Software Variant Explosion for GPU Accelerators

Matched Filter Computation on FPGA, Cell and GPU

MatConvNet – Convolutional Neural Networks for MATLAB

Material Removal Simulation and Cutting Force Prediction of Multi-Axis Machining Processes on General-Purpose Graphics Processing Units

Mathematical limits of parallel computation for embedded systems

MATLAB and Python for GPU Computing

MATLAB graphical interface for GPU based FDTD method

MATLAB Medical Images Classification on Graphics Processors

MATLAB Parallelization through Scalarization

Matrix Computations and Optimization in Apache Spark

Matrix Convolution using Parallel Programming

Matrix Factorization on GPUs with Memory Optimization and Approximate Computing

Matrix inversion speed up with CUDA

Matrix Multiplication Beyond Auto-Tuning: Rewrite-based GPU Code Generation

Matrix Multiplication on GPUs with On-Line Fault Tolerance

Matrix Multiplication Using Only Addition

Matrix Multiplication with CUDA – A basic introduction to the CUDA programming model

Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

Matrix-Matrix Multiplications on GPUs for Accelerating a Parallel Fluid Dynamics Code

maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs

Maximal Information Coefficient Analysis

Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study

Maximizing Parallelism and GPU Utilization For Direct GPU Compilation Through Ensemble Execution

Maximum likelihood event estimation and list-mode image reconstruction on GPU hardware

Maximum mipmaps for fast, accurate, and scalable dynamic height field rendering

MaxSSmap: A GPU program for short read mapping with the maximum scoring subsequence

MC-RANSAC: A Pre-processing Model for RANSAC using Monte Carlo method implemented on a GPU

MCBooster: a library for fast Monte Carlo generation of phase-space decays on massively parallel platforms

MCS 572: Introduction to Supercomputing

Titles: 100
open PDFs: 91
packages: 22
