high performance computing on graphics processing units: hgpu.org

Posts

Aug, 5

Reconstruction and visualization of planetary nebulae

From our terrestrially confined viewpoint, the actual three-dimensional shape of distant astronomical objects is, in general, very challenging to determine. For one class of astronomical objects, however, spatial structure can be recovered from conventional 2D images alone. So-called planetary nebulae (PNe) exhibit pronounced symmetry characteristics that come about due to fundamental physical processes. Making use […]

Aug, 5

Accelerating wavelet-based video coding on graphics hardware using CUDA

The discrete wavelet transform (DWT) has a wide range of applications from signal processing to video and image compression. This transform, by means of the lifting scheme, can be performed in a memory- and computation-efficient way on modern, programmable GPUs, which can be regarded as massively parallel co-processors through NVIDIA’s CUDA compute paradigm. The […]

CUDA

Aug, 5

IRIS: Illustrative Rendering for Integral Surfaces

Integral surfaces are ideal tools to illustrate vector fields and fluid flow structures. However, these surfaces can be visually complex and exhibit difficult geometric properties, owing to strong stretching, shearing and folding of the flow from which they are derived. Many techniques for non-photorealistic rendering have been presented previously. It is, however, unclear how these […]

OpenGL

Aug, 5

Constrained inverse volume rendering for planetary nebulae

Determining the three-dimensional structure of distant astronomical objects is a challenging task, given that terrestrial observations provide only one viewpoint. For this task, bipolar planetary nebulae are interesting objects of study because of their pronounced axial symmetry due to fundamental physical processes. Making use of this symmetry constraint, we present a technique to automatically recover […]

Aug, 5

Depth Images: Representations and Real-Time Rendering

Depth Images are viable representations that can be computed from the real world using cameras and/or other scanning devices. The depth map provides 2.5D structure of the scene. A set of Depth Images can provide hole-free rendering of the scene. Multiple views need to be blended to provide smooth hole-free rendering, however. Such a representation of […]

OpenGL

Aug, 5

Interactive Level-of-Detail Selection Using Image-Based Quality Metric for Large Volume Visualization

For large volume visualization, an image-based quality metric is difficult to incorporate for level-of-detail selection and rendering without sacrificing the interactivity. This is because it is usually time-consuming to update view-dependent information as well as to adjust to transfer function changes. In this paper, we introduce an image-based level-of-detail selection algorithm for interactive visualization of […]

OpenGL

Aug, 5

An optimised multi-baseline approach for on-line MR-temperature monitoring on commodity graphics hardware

Magnetic resonance imaging (MRI) can be used for non-invasive temperature mapping and is therefore a promising tool to monitor and control interventional therapies based on thermal ablation. The proton resonance frequency shift MRI technique gives an estimate of the temperature by comparing phase changes between dynamically acquired images. These temperature measurements are prone to […]

CUDA

Aug, 5

Petascale visualization: Approaches and initial results

With the advent of the first petascale supercomputer, Los Alamos’s Roadrunner, there is a pressing need to address how to visualize petascale data. The crux of the petascale visualization performance problem is interactive rendering, since it is the most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors (GPUs) […]

OpenGL

Aug, 4

A parallel decoding algorithm of LDPC codes using CUDA

A parallel belief propagation algorithm for decoding low-density parity-check (LDPC) codes is presented in this paper based on Compute Unified Device Architecture (CUDA). As a new hardware and software architecture for addressing and managing computations, CUDA offers parallel data computing using the highly multithreaded coprocessor driven by very high memory bandwidth GPU. The parallel decoding […]

CUDA

Aug, 4

Image-Based Proxy Accumulation for Real-Time Soft Global Illumination

We present a new, general, and real-time technique for soft global illumination in low-frequency environmental lighting. It accumulates over relatively few spherical proxies which approximate the light blocking and re-radiating effect of dynamic geometry. Soft shadows are computed by accumulating log visibility vectors for each sphere proxy as seen by each receiver point. Inter-reflections are […]

Aug, 4

Tradeoffs in designing accelerator architectures for visual computing

Visualization, interaction, and simulation (VIS) constitute a class of applications that is growing in importance. This class includes applications such as graphics rendering, video encoding, simulation, and computer vision. These applications are ideally suited for accelerators because of their parallelizability and demand for high throughput. We compile a benchmark suite, VISBench, to serve as […]

Aug, 4

Global Illumination for Interactive Lighting Design Using Light Path Pre-Computation and Hierarchical Histogram Estimation

In this paper, we propose a fast global illumination solution for interactive lighting design. Using our method, light sources and the viewpoint are movable, and the characteristics of materials can be modified (assuming low-frequency BRDF) during rendering. Our solution is based on particle tracing (a variation of photon mapping) and final gathering. We assume that […]
