high performance computing on graphics processing units: hgpu.org

Papers on hgpu.org (.txt-file)

Modular Arithmetic for Solving Linear Equations on the GPU

Modular FPGA Systems with Support for Dynamic Workloads and Virtualisation

Modular Resultant Algorithm for Graphics Processors

Modular Technology in the Modelling of Large Virtual Environments in Driving Simulators

Moim: A Multi-GPU MapReduce Framework

Molecular Activity Prediction using Deep Learning Software Library

Molecular Distance Geometry Optimization Using Geometric Build-up and Evolutionary Techniques on GPU

Molecular Docking on FPGA and GPU Platforms

Molecular dynamics for long-range interacting systems on Graphic Processing Units

Molecular Dynamics on a Grand Scale

Molecular dynamics recipes for genome research

Molecular Dynamics Simulation Based on Hadoop MapReduce

Molecular dynamics simulation of complex multiphase flow on a computer cluster with GPUs

Molecular Dynamics Simulation of Macromolecules Using Graphics Processing Unit

Molecular Dynamics Simulation of Multi-Scale Flows on GPUs

Molecular dynamics simulation of the supercooled Al melt on GPUs

Molecular dynamics simulation of UO2 nanocrystals melting

Molecular dynamics simulations of the relaxation processes in the condensed matter on GPUs

Molecular Dynamics Simulations on Commodity GPUs with CUDA

Molecular dynamics simulations through GPU video games technologies

Molecular Dynamics Simulations Using Graphics Processing Units

Molecular dynamics simulations with many-body potentials on multiple GPUs – the implementation, package and performance

Molecular Simulation of ab Initio Protein Folding for a Millisecond Folder NTL9(1-39)

Molecular Simulations using CUDA

Molecular structural mechanics approach to carbon nanotubes on graphics processing units

Monadic Deep Learning

Monitoring Collective Communication Among GPUs

Monitoring Large-scale Microblog on GPUs

Monitoring Multiple Streams with Dynamic Time Warping using Graphic Processors

Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer

Montblanc: GPU accelerated Radio Interferometer Measurement Equations in support of Bayesian Inference for Radio Observations

Monte Carlo integration on GPU

Monte Carlo methods for massively parallel computers

Monte Carlo Modeling of Electron Transport Using CUDA Technology

Monte Carlo Path Tracing with OpenCL

Monte Carlo Radiative Transport on the GPU

Monte Carlo randomization tests for large-scale abundance datasets on the GPU

Monte Carlo simulation of photon migration in 3D turbid media accelerated by graphics processing units

Monte Carlo simulations on Graphics Processing Units

Monte-Carlo Black-Scholes Implementation using OpenCL Standard

More Bang For Your Buck(et): Fast and Space-efficient Hardware-accelerated Coarse-granular Indexing on GPUs

Morph Algorithms on GPUs

Morphological Proximity Priors: Spatial Relationships for Semantic Segmentation

Motion Compensation and Reconstruction of H.264/AVC Video Bitstreams using the GPU

Motion Estimation for H.264/AVC using Programmable Graphics Hardware

Motion Estimation with Non-Local Total Variation Regularization

Motion planning for autonomous driving with a conformal spatiotemporal lattice

Movement Tracking in Terrain Conditions Accelerated with CUDA

Moving Least-Squares Reconstruction of Large Models with GPUs

Mpache: Interaction Aware Multi-level Cache Bypassing on GPUs

MPC Toolbox with GPU Accelerated Optimization Algorithms

MPC: A Massively Parallel Compression Algorithm for Scientific Data

MPI Derived Datatypes Processing on Noncontiguous GPU-resident Data

MPI Parallelization of GPU-based Lattice Boltzmann Simulations

MPI within a GPU

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-Based Systems

MPI-CUDA parallelization of a finite-strip program for geometric nonlinear analysis: A hybrid approach

MPI-GIS: New Parallel Overlay Algorithm and System Prototype

MPI-GPU parallelism in iterative eigensolvers for block-tridiagonal matrices

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

MR-API: A Comprehensive API Framework for Heterogeneous Multi-core Systems using Map Reduce Programming Model

Mr. Scan: Extreme Scale Density-Based Clustering using a Tree-Based Network of GPGPU Nodes

MrBayes on a Graphics Processing Unit

MrBayes tgMC3: A Tight GPU Implementation of MrBayes

MRCUDA: MapReduce Acceleration Framework Based on GPU

MRPB: Memory Request Prioritization for Massively Parallel Processors

MSA-CUDA: Multiple Sequence Alignment on Graphics Processing Units with CUDA

MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications

MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems

MSTg: Cryptographically strong pseudorandom number generator and its realization

mu-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching

mu-grind: A Framework for Dynamically Instrumenting HLS-Generated RTL

Multi Agent Navigation on the GPU

Multi GPU Implementation of Iterative Tomographic Reconstruction Algorithms

Multi GPU Implementation of the Simplex Algorithm

Multi GPU Performance of Conjugate Gradient Algorithm with Staggered Fermions

Multi GPU Performance of Conjugate Gradient Solver with Staggered Fermions in Mixed Precision

Multi scale block histogram of template feature for pedestrian detection

Multi- and many-core data mining with adaptive sparse grids

Multi-Agent Systems and General-Purpose Computing on Graphics Processing Units: A Survey

Multi-agent traffic simulation with CUDA

Multi-camera real-time depth estimation with discontinuity handling on PC graphics hardware

Multi-Centroid PSO Classification Learning on the GPU

Multi-core CPU or GPU-accelerated Multiscale Modeling for Biomolecular Complexes

Multi-core CUDA Architecture for Parallelization of Hierarchical Text Clustering

Multi-core parallelism in a column-store

Multi-Core Programming Design Patterns: Stream Processing Algorithms for Dynamic Scene Perceptions

Multi-core programming with OpenCL: performance and portability: OpenCL in a memory bound scenario

Multi-dimensional characterization of electrostatic surface potential computation on graphics processors

Multi-dimensional characterization of temporal data mining on graphics processors

Multi-dimensional Functional Principal Component Analysis

Multi-Directional Optimisation on the GPU

Multi-domain, Higher Order Level Set Scheme for 3D Image Segmentation on the GPU

Multi-Elimination ILU Preconditioners on GPUs

Multi-fragment effects on the GPU using the k-buffer

Multi-GPGPU Cellular Automata Simulations using OpenACC

Multi-GPU accelerated multi-spin Monte Carlo simulations of the 2D Ising model

Multi-GPU Accelerated Parallel Algorithm of Wallis Transformation for Image Enhancement

Multi-GPU Acceleration of Black-Scholes Equation based Option Pricing

Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations

Brief statistics for this page

Titles: 100

Open PDFs: 89

Packages: 13

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

PELSI: Power-Efficient Layer-Switched Inference

Efficient deep learning inference on end devices

Ouroboros: Virtualized Queues for dynamic memory management

Dynamic Memory Management on GPUs with SYCL

* * *

Recent source codes

XaaS containers

microSYCL: SYCL micro-benchmarks repository

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench
