high performance computing on graphics processing units: hgpu.org

Posts

Jan, 31

Highly interactive computational steering for coupled 3D flow problems utilizing multiple GPUs

Most computational fluid dynamics (CFD) simulations require massive computational power, which is usually provided by traditional High Performance Computing (HPC) environments. Although interactivity of the simulation process is highly appreciated by scientists and engineers, due to limitations of typical HPC environments, present CFD simulations are usually executed non-interactively. A recent trend is to harness […]

Jan, 31

Top-Performance Tokenization and Small-Ruleset Regular Expression Matching: A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor

In the last decade, the volume of unstructured data that Internet and enterprise applications create and consume has been growing at impressive rates. The tools we use to process these data are search engines, business analytics suites, natural-language processors and XML processors. These tools rely on tokenization, a form of regular expression matching aimed at […]

Jan, 31

Small-ruleset regular expression matching on GPGPUs: quantitative performance analysis and optimization

We explore the intersection between an emerging class of architectures and a prominent workload: GPGPUs (General-Purpose Graphics Processing Units) and regular expression matching, respectively. It is a challenging task because this workload — with its irregular, non-coalesceable memory access patterns — is very different from the regular, numerical workloads that run efficiently on GPGPUs. Small-ruleset […]

CUDA

Jan, 30

A Novel Monte Carlo Noise Reduction Operator

We propose a novel Monte Carlo noise reduction operator in this article. We apply and extend the standard bilateral filtering method and build a new local adaptive noise reduction kernel. It first computes an initial estimate for the value of each pixel, and then applies bilateral filtering using this initial estimate in its range filter […]

Jan, 30

Geometry Textures and Applications

Geometry textures are a novel geometric representation for surfaces based on height maps. The visualization is done through a graphics processing unit (GPU) ray casting algorithm applied to the whole object. At rendering time, the fine-scale details (mesostructures) are reconstructed preserving original quality. Visualizing surfaces with geometry textures allows a natural level-of-detail (LOD) behaviour. There […]

OpenGL

Jan, 30

Real-Time Depth-of-Field Rendering Using Point Splatting on Per-Pixel Layers

We present a real-time method for rendering a depth-of-field effect based on the per-pixel layered splatting where source pixels are scattered on one of the three layers of a destination pixel. In addition, the missing information behind foreground objects is filled with an additional image of the areas occluded by nearer objects. The method creates […]

OpenGL

Jan, 30

Efficient image reconstruction for point-based and line-based rendering

We address the problem of an efficient image-space reconstruction of adaptively sampled scenes in the context of point-based and line-based graphics. The image-space reconstruction offers an advantageous time complexity compared to surface splatting techniques and, in fact, our improved GPU implementation performs significantly better than splatting implementations for large point-based models. We discuss the integration […]

OpenGL

Jan, 30

Fast and Scalable CPU/GPU Collision Detection for Rigid and Deformable Surfaces

We present a new hybrid CPU/GPU collision detection technique for rigid and deformable objects based on spatial subdivision. Our approach efficiently exploits the massive computational capabilities of modern CPUs and GPUs commonly found in off-the-shelf computer systems. The algorithm is specifically tailored to be highly scalable on both the CPU and the GPU sides. We […]

CUDA

Jan, 30

Feature based terrain generation using diffusion equation

This paper presents a diffusion method for generating terrains from a set of parameterized curves that characterize the landform features such as ridge lines, riverbeds or cliffs. Our approach provides the user with an intuitive vector-based feature-oriented control over the terrain. Different types of constraints (such as elevation, slope angle and roughness) can be attached […]

OpenGL

Jan, 30

Asynchronous Communication Schemes for Finite Difference Methods on Multiple GPUs

Finite difference methods continue to provide an important and parallelisable approach to many numerical simulation problems. Iterative multigrid and multilevel algorithms can converge faster than ordinary finite difference methods but can be more difficult to parallelise. Data parallel paradigms tend to lend themselves particularly well to solving regular mesh PDEs whereby low latency communications and […]

CUDA

Jan, 30

Accelerating marching cubes with graphics hardware

Medical imaging and scientific simulation produce large volumetric datasets, which are often visualized by isosurface extraction and rendering. This extracts individual surfaces from the volume to represent significant boundaries in the volume. The standard isosurface extraction method, Marching Cubes, generates a triangulated approximation of the isosurface one cell at a time. We contribute improvements with respect […]

OpenGL

Jan, 30

Implicit Boundary Control of Vector Field Based Shape Deformations

We present a shape deformation approach which preserves volume, prevents self-intersections and allows for exact control of the deformation impact. The volume preservation and prevention of self-intersections are achieved by utilizing the method of Vector Field Based Shape Deformations. This method produces physically plausible deformations efficiently by integrating formally constructed divergence-free vector fields, where the […]

* * *

Recent source codes

AutoDock-GPU: AutoDock for GPUs and other accelerators

NCCLX: collective communication framework

Tutoring LLM into a Better CUDA Optimizer

Kernel Library for LLM Serving

Adaptivity in AdaptiveCpp: Optimizing Performance by Leveraging Runtime Information During JIT-Compilation

Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs

Genten: Software for Generalized Tensor Decompositions by Sandia National Laboratories

Interleaved Learning and Exploration: A Self-Adaptive Fuzz Testing Framework for MLIR

Pinocchio: PINpointing Orbit Crossing Collapsed Hierarchical Objects

KernelCoder: trained on a curated dataset of reasoning traces and CUDA kernel pairs
