Papers on hgpu.org (.txt-file)

Medusa: Simplified Graph Processing on GPUs Download Package

Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores Download Package

MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph Download Package

Megakernels Considered Harmful: Wavefront Path Tracing on GPUs Download

Megapixel Topology Optimization on a Graphics Processing Unit

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU Download Package

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism Download Package

Melia: A MapReduce Framework on OpenCL-based FPGAs Download Package

MELT - a Translated Domain Specific Language Embedded in the GCC Compiler Download

MemAscend: System Memory Optimization for SSD-Offloaded LLM Fine-Tuning Download

MemcachedGPU: Scaling-up Scale-out Key-value Stores Download Package

Memory Access Optimized Implementation of Cyclic and Quasi-Cyclic LDPC Codes on a GPGPU

Memory Bandwidth and Latency in HPC: System Requirements and Performance Impact Download

Memory Bandwidth Efficient Two-Dimensional Fast Fourier Transform Algorithm and Implementation for Large Problem Sizes Download

Memory Efficient Mixed-Precision Optimizers Download

Memory Interference and Performance Prediction in GPU-Accelerated Heterogeneous Systems Download

Memory layout in GPU implementation of lattice Boltzmann method for sparse 3D geometries Download

Memory Optimization for Deep Networks Download Package

Memory Saving Discrete Fourier Transform on GPUs Download

Memory transfer optimization for a lattice Boltzmann solver on Kepler architecture nVidia GPUs Download

Memory-Efficient Acceleration of Block Low-Rank Foundation Models on Resource Constrained GPUs Download Package

Memory-efficient Adaptive Subdivision for Software Rendering on the GPU Download Package

Memory-efficient implementation of a graphics processor-based cluster detection algorithm for large spatial databases

Memory-Efficient Implementation of DenseNets Download Package

Memory-Efficient Object-Oriented Programming on GPUs Download Package

Memory-Efficient Single-Pass GPU Rendering of Multi-fragment Effects Download

Memory-level and Thread-level Parallelism Aware GPU Architecture Performance Analytical Model Download

Memory-Scalable GPU Spatial Hierarchy Construction Download

Merge or Separate? Multi-job Scheduling for OpenCL Kernels on CPU/GPU Platforms Download

Merge: a programming model for heterogeneous multi-core systems Download

Mersenne Twister Random Number Generation on FPGA, CPU and GPU Download

Mesh deformations in X3D via CUDA with freeform deformation lattices Download

Mesh Independent Loop Fusion for Unstructured Mesh Applications Download

Mesh mutation in programmable graphics hardware Download

Meshfree/GFEM in hardware-efficiency prospective Download

Message passing for GPGPU clusters: CudaMPI Download Package

Message Passing Interface support for the runtime adaptive multi-processor system-on-chip RAMPSoC

Message passing on data-parallel architectures Download

Meta Networks for Neural Style Transfer Download Package

Meta-Programming and Auto-Tuning in the Search for High Performance GPU Code Download

Meta-programming and Multi-stage Programming for GPGPUs Download

Meta-simulation of large WSN on multi-core computers Download

MetaBinG: Using GPUs to Accelerate Metagenomic Sequence Classification Download Package

MetaCL – A Model-Based Approach to Programming Heterogeneous Architectures Using OpenCL Download

MetaFork: A Compilation Framework for Concurrency Models Targeting Hardware Accelerators and Its Application to the Generation of Parametric CUDA Kernels Download Package

MetaMorph: A Library Framework for Interoperable Kernels on Multi- and Many-core Clusters Download

Metamorphic Testing for (Graphics) Compilers Download

Metaprogramming GPUs with Sh Download

Method for simulation of coastal terrain on GPU

Methodology of control and supervision of web connected mobile robots with CUDA technology application Download

Methods and Metrics for Fair Server Assessment under Real-Time Financial Workloads Download

Methods for Accelerating Machine Learning in High Performance Computing Download

Methods for GPU Acceleration of Big Data Applications Download

Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures Download

MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring Download Package

MGPUSim: Enabling Multi-GPU Performance Modeling and Optimization Download Package

MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures Download

MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC) Download Package

Microarchitectural Performance Characterization of Irregular GPU Kernels Download

Microbenchmark-Driven Analytical Performance Modeling Across Modern GPU Architectures Download

Microbenchmarking NVIDIA’s Blackwell Architecture: An in-depth Architectural Analysis Download

Microbenchmarks for GPU characteristics: the occupancy roofline and the pipeline model Download Package

Microbranching in mode-I fracture using large scale simulations of amorphous and perturbed lattice models Download

Microlensing Observations Rapid Search for Exoplanets: MORSE code for GPUs Download

Micropolygon ray tracing with defocus and motion blur

MIDeA: a multi-parallel intrusion detection architecture Download

Migrating CUDA to oneAPI: A Smith-Waterman Case Study Download

Migrating from OpenGL ES to Vulkan Download

Migrating real-time depth image-based rendering from traditional to next-gen GPGPU

MILC Code Performance on High End CPU and GPU Supercomputer Clusters Download

MILC on GPUs Download

MILC staggered conjugate gradient performance on Intel KNL Download

MILJS: Brand New JavaScript Libraries for Matrix Calculation and Machine Learning Download Package

MiMatrix: A Massively Distributed Deep Learning Framework on a Petascale High-density Heterogeneous Cluster Download

MIMD Interpretation on a GPU Download

Mimetic Methods for Lagrangian Relaxation of Magnetic Fields Download

Mìmir: A real-time interactive visualization library for CUDA programs Download

MIML Learning with CNNs: Yelp Restaurant Photo Classification Download

Mind the gap!: bridging the dichotomy of design and implementation Download

Minerals detection for hyperspectral images using adapted linear unmixing: LinMin Download

Minerva: A Scalable and Highly Efficient Training Platform for Deep Learning Download

MinGPU: a minimum GPU library for computer vision Download Package

miniLB: A Performance Portability Study of Lattice-Boltzmann Simulations Download Package

Minimal models for finite particles in fluctuating hydrodynamics Download Package

minimap2-fpga: Integrating hardware-accelerated chaining for efficient end-to-end long-read sequence mapping Download Package

Minimising Testing in Genetic Programming Download

Mining Rare Features in Fingerprints Using Core Points and Triplet-based Features Download

Mint: realizing CUDA performance in 3D stencil methods with annotated C Download Package

Minuet: Accelerating 3D Sparse Convolutions on GPUs Download Package

MIOpen: An Open Source Library For Deep Learning Primitives Download Package

Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU Download

Mirovia: A Benchmarking Suite for Modern Heterogeneous Computing Download Package

MITHRA: Multiple data independent tasks on a heterogeneous resource architecture Download

Mix-and-Match: A Model-driven Runtime Optimisation Strategy for BFS on GPUs Download Package

Mixed precision in Graphics Processing Unit Download

Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems Download

Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System Download Package

Mixed-Precision Embedding Using a Cache Download

Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration Download Package

Mixed-Precision GPU-Multigrid Solvers with Strong Smoothers Download

Brief statistics for this page

Titles: 100

Download open PDFs: 93

Packages: 32

* * *

HGPU group © 2010-2026 hgpu.org

All rights belong to the respective authors
