high performance computing on graphics processing units: hgpu.org

Posts

Feb, 9

High-precision molecular dynamics simulation of UO2-PuO2: superionic transition in uranium dioxide

Our series of articles is devoted to high-precision molecular dynamics simulation of mixed actinide-oxide (MOX) fuel in the rigid ions approximation using high-performance graphics processors (GPUs). In this article we assess the 10 most relevant interatomic sets of pair potentials (SPP) by reproduction of the Bredig superionic phase transition (anion sublattice premelting) in uranium dioxide. […]

CUDA

Feb, 9

High-precision molecular dynamics simulation of UO2-PuO2: pair potentials comparison

Our series of articles is devoted to high-precision molecular dynamics simulation of mixed actinide-oxide (MOX) fuel in the rigid ions approximation using high-performance graphics processors (GPUs). In the first article we assess the 10 most relevant interatomic sets of pair potentials (SPP) by reproduction of solid phase properties of uranium dioxide (UO2) – temperature dependences of […]

CUDA

Feb, 9

Fast Analysis of Molecular Dynamics Trajectories with Graphics Processing Units – Radial Distribution Function Histogramming

The calculation of radial distribution functions (RDFs) from molecular dynamics trajectory data is a common and computationally expensive analysis task. The rate limiting step in the calculation of the RDF is building a histogram of the distance between atom pairs in each trajectory frame. Here we present an implementation of this histogramming scheme for multiple […]

CUDA
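
The histogramming step described in the abstract above can be illustrated with a minimal CPU sketch. This is not the paper's GPU implementation: the function names `pair_distance_histogram` and `accumulate_histogram`, the parameters `r_max` and `n_bins`, and the choice to ignore periodic boundaries and RDF normalization are all assumptions made for illustration.

```python
import math
from itertools import combinations

def pair_distance_histogram(frame, r_max, n_bins):
    """Count atom pairs per distance bin for one frame.

    Assumptions (not from the paper): `frame` is a list of (x, y, z)
    tuples, there are no periodic boundaries, and pairs at distance
    r_max or beyond are skipped.
    """
    bin_width = r_max / n_bins
    counts = [0] * n_bins
    for a, b in combinations(frame, 2):  # every unordered atom pair
        r = math.dist(a, b)
        if r < r_max:
            # Clamp to guard against floating-point rounding at the edge.
            index = min(int(r / bin_width), n_bins - 1)
            counts[index] += 1
    return counts

def accumulate_histogram(frames, r_max, n_bins):
    """Sum the per-frame histograms over a whole trajectory."""
    total = [0] * n_bins
    for frame in frames:
        for i, c in enumerate(pair_distance_histogram(frame, r_max, n_bins)):
            total[i] += c
    return total
```

The double loop over atom pairs grows quadratically with the number of atoms and repeats for every frame, which is consistent with the abstract calling this the rate-limiting step. Turning the counts into an actual RDF would still require normalizing by shell volume and density, which this sketch omits.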

Feb, 9

Accelerating urban fast response Lagrangian dispersion simulations using inexpensive graphics processor parallelism

Owing to the potential consequences associated with accidental or deliberate releases of chemical or biological agents in urban areas, fast response urban dispersion models must rapidly provide solutions that can be easily analyzed by researchers and emergency responders. In this paper, we describe a novel application of an existing Lagrangian dispersion modeling system to achieve […]

Feb, 9

Next-generation acceleration and code optimization for light transport in turbid media using GPUs

A highly optimized Monte Carlo (MC) code package for simulating light transport is developed on the latest graphics processing unit (GPU) built for general-purpose computing from NVIDIA – the Fermi GPU. In biomedical optics, the MC method is the gold standard approach for simulating light transport in biological tissue, both due to its accuracy and […]

CUDA

Feb, 9

Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition

Good old on-line back-propagation for plain multi-layer perceptrons yields a very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images, and graphics cards to greatly speed up learning.

CUDA

Feb, 9

Deep, Big, Simple Neural Nets for Handwritten Digit Recognition

Good old online backpropagation for plain multilayer perceptrons yields a very low 0.35% error rate on the MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images to avoid overfitting, and graphics cards to greatly speed up learning. Good […]

CUDA

Feb, 9

Real-time GPU color-based segmentation of football players

In this paper, we propose a multi-camera application capable of processing high-resolution images and extracting features based on color patterns on graphics processing units (GPUs). The goal is to work in real time in the uncontrolled environment of a sports event such as a football match. Since football players are composed of diverse and complex […]

CUDA

Feb, 9

Scalable SMT-based verification of GPU kernel functions

Interest in Graphical Processing Units (GPUs) is skyrocketing due to their potential to yield spectacular performance on many important computing applications. Unfortunately, writing such efficient GPU kernels requires painstaking manual optimization effort which is very error prone. We contribute the first comprehensive symbolic verifier for kernels written in CUDA C. Called the ‘Prover of User […]

CUDA

Feb, 8

Mobile 3D Graphics

While advanced 3D graphics techniques are well understood and supported by industry standards, this is less true in the emerging mobile applications and games market. This book redresses this imbalance, providing an in-depth look at the new OpenGL ES (The Standard for Embedded Accelerated 3D Graphics), and shows what these new embedded systems graphics libraries […]

Feb, 8

Real-time relief mapping on arbitrary polygonal surfaces

This paper presents a technique for mapping relief textures onto arbitrary polygonal models in real time. In this approach, the mapping of the relief data is done in tangent space. As a result, it can be applied to polygonal representations of curved surfaces producing correct self-occlusions, interpenetrations, shadows and per-pixel lighting effects. The approach can […]

Feb, 8

Adaptive sampling of intersectable models exploiting image and object-space coherence

We present a sampling strategy and rendering framework for intersectable models, whose surface is implicitly defined by a black box intersection test that provides the location and normal of the closest intersection of a ray with the surface. To speed up image generation despite potentially slow intersection tests, our method exploits spatial coherence by adjusting […]

OpenGL

* * *

Recent source codes

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

CuFuzz: An API-Knowledge-Graph Coverage-Driven Fuzzing Framework for CUDA Libraries

AutoPass: Evidence-Guided LLM Agents for Compiler Performance Tuning

Probe-and-Refine Tuning of Repository Guidance for AI Coding Agents

CUDAnalyst (CUDA + Analyst)

CodegenBench

KernelBenchX: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

CUDA Kernel Fusion Benchmarks

IntelliKit: Agent-first tooling for AMD hardware

DITRON: Distributed Compiler based on Triton for Parallel Systems

Most viewed papers (last 30 days)