Papers on hgpu.org (.txt-file)
Noise Removal from Remote Sensed Images by NonLocal Means with OpenCL Algorithm
Noise-resistant fitting for spherical harmonics
Non-blocking programming on multi-core graphics processors: (extended abstract)
Non-Determinism in TensorFlow ResNets
Non-deterministic parallelism considered useful
Non-Hydrostatic Pressure Shallow Flows: GPU Implementation Using Finite-Volume and Finite-Difference Scheme
Non-local means denoising algorithm accelerated by GPU
Non-Local Total Generalized Variation for Optical Flow Estimation
Non-Parametric Adaptive Network Pruning
Non-recursive beam search on GPU for formal concept analysis
Non-rigid multi-modal registration on the GPU
Non-separable 2D, 3D and 4D filtering with CUDA
Non-steady relaxation and critical exponents at the depinning transition
Non-symmetric magnetohydrostatic equilibria: a multigrid approach
Non-Uniform Domain Decomposition for Heterogeneous Accelerated Processing Units
Non-Uniformly Partitioned Block Convolution on Graphics Processing Units
Nonlinear Dynamic Analysis Efficiency by Using a GPU Parallelization
Nonlinear dynamic finite element analysis with GPU
Nonlinear optimization framework for image-based modeling on programmable graphics hardware
Nonmetric Priors for Continuous Multilabel Optimization
Nonnegative Tensor Factorization Accelerated Using GPGPU
Nonperturbative Quantum Field Theory in Astrophysics
Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks
NOVA: A Functional Language for Data Parallelism
Novel Architectures: Solving Computational Problems with GPU Computing
Novel Data-Partitioning Algorithms for Performance and Energy Optimization of Data-Parallel Applications on Modern Heterogeneous HPC Platforms
Novel GPU Implementation of Jacobi Algorithm for Karhunen-Loeve Transform of Dense Matrices
Novel implementations of recursive discrete wavelet transform for real time computation with multicore systems on chip (SOC)
Novel insights on atomic synchronization for sort-based group-by on GPUs
Novel Methodologies for Predictable CPU-To-GPU Command Offloading
Novel Multi-Layer Network Decomposition Boosting Acceleration of Multi-core Algorithms
Novel Parallel Approaches to Efficiently Solve Spatial Problems on Heterogeneous CPU-GPU Systems
Novel Parallelization Strategies for High-Performance DNN Training on HPC Systems
NPBench: A Benchmarking Suite for High-Performance NumPy
NQueens on CUDA: Optimization Issues
NT-SIM: A Co-Simulator for Networked Signal Processing Applications
Nucleation of nanoparticles in a coarse grained fluid using OpenCL
Nucleation Studies on Graphics Processing Units
Nuclei: GPU-Accelerated Many-Core Network Coding
NUMA Data-Access Bandwidth Characterization and Modeling
NUMA-Aware Image Compositing on Multi-GPU Platform
Numerical Accuracy Analysis Based on the Discrete Stochastic Arithmetic on Multiprocessor Platforms
Numerical Accuracy Differences in CPU and GPGPU Codes
Numerical computations in Java with CUDA
Numerical Computations with GPUs
Numerical cosmology on the GPU with Enzo and Ramses
Numerical integration on GPUs for higher order finite elements
Numerical investigations on nonlinear nonparaxial beam propagation using graphics processing units
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
Numerical Model of Shallow Water: the Use of NVIDIA CUDA Graphics Processors
Numerical Modeling of Atmospheric Vortices
Numerical modeling of gravitational wave sources accelerated by OpenCL
Numerical Ocean Modeling and Simulation with CUDA
Numerical Parallel Processing Based on GPU with CUDA Architecture
Numerical Precision and Benchmarking Very-High-Order Integration of Particle Dynamics on GPU Accelerators
Numerical resolution of conservation laws with OpenCL
Numerical Simulation for the MHD System in 2D Using OpenCL
Numerical simulation of 3D particulate flows based on GPU technology
Numerical Simulation of Melting with Natural Convection Based on Lattice Boltzmann Method and Performed with CUDA Enabled GPU
Numerical Simulation of the Complex Ginzburg-Landau Equation on GPUs with CUDA
Numerical Simulation of the Frank-Kamenetskii PDE: GPU vs. CPU Computing
Numerical simulations of acoustic waves with the graphic acceleration GAMER code
Numerical solution of PDEs with hybrid and heterogeneous computing models
Numerical Solutions of Heat and Mass Transfer with the Third Kind Boundary and Initial Conditions in Capillary Porous Media Using Programmable Graphics Hardware
Numerical Study of Geometric Multigrid Methods on CPU–GPU Heterogeneous Computers
NUPAR: A Benchmark Suite for Modern GPU Architectures
NVIDIA CUDA software and GPU parallel computing architecture
NVIDIA SimNet: an AI-accelerated multi-physics simulation framework
NVIDIA Tensor Core Programmability, Performance & Precision
NVIDIA Tesla: A Unified Graphics and Computing Architecture
Object Detection Based Handwriting Localization
Object Oriented Framework for CUDA based Pyramidal Image Blending
Object oriented framework for real-time image processing on GPU
Object Space Based Collision Detection for Cloth Simulation on the GPU
Object support for OpenMP-style programming of GPU clusters in Java
Object-oriented stream programming using aspects
Object-oriented stream programming using Aspects: a high-productivity programming paradigm for hybrid platforms
Objective-Driven Workload Allocation in Heterogeneous Computing Systems
Obsidian: GPU Kernel Programming in Haskell (thesis)
Obsidian: GPU Programming in Haskell
Obtaining a 35x Speedup in 2D Phase Unwrapping Using Commodity Graphics Processors
OCCA: A unified approach to multi-threading languages
Ocean wave simulation in real-time using GPU
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
Ocelot/HyPE: Optimized Data Processing on Heterogeneous Hardware
OCLoptimizer: An Iterative Optimization Tool for OpenCL
OCT on CUDA: Speeding up the image reconstruction algorithm for an Optical Coherence Tomography system using NVIDIA’s CUDA platform
Octree Light Propagation Volumes
Odeint – Solving ordinary differential equations in C++
Odyssey: A Public GPU-Based Code for General-Relativistic Radiative Transfer in Kerr Spacetime
Off-axis quantitative phase imaging processing using CUDA: toward real-time applications
Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads
Offload Compiler Runtime for the Intel Xeon Phi Coprocessor
Offloading Critical Security Operations to the GPU
Titles: 100
open PDFs: 87
packages: 14