high performance computing on graphics processing units: hgpu.org

Posts

Jun, 26

A Low-Cost Solution For Excavator Simulation With Realistic Visual Effect

A low-cost excavator simulator has been developed in this paper for training human operators and evaluating control strategies for heavy-duty hydraulic machines. In such a system, the operator controls a virtual excavator by means of a joystick while experiencing realistic operating feelings through force feedback, graphical displays, and sound effects in virtual operating environments. A […]

Jun, 26

Importance-Driven Particle Techniques for Flow Visualization

Particle tracing has been established as a powerful visualization technique to show the dynamics of 3D flows. Particle tracing in 3D, however, quickly overextends the viewer due to the massive amount of visual information that is typically produced by this technique. In this paper, we present strategies to reduce this amount at the same time […]
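As a rough illustration of the particle tracing this abstract refers to, the sketch below advects a single particle through a steady 3D velocity field with fixed-step fourth-order Runge-Kutta integration. The field `rotation_field`, the step size, and all function names are assumptions made for illustration; the paper's importance-driven reduction strategies are not shown.

```python
import math

def rotation_field(p):
    """Hypothetical steady 3D flow: rotation about the z-axis plus a slow upward drift."""
    x, y, z = p
    return (-y, x, 0.1)

def rk4_step(field, p, h):
    """Advance position p by one fourth-order Runge-Kutta step of size h."""
    def shifted(base, k, scale):
        return tuple(b + scale * ki for b, ki in zip(base, k))
    k1 = field(p)
    k2 = field(shifted(p, k1, h / 2))
    k3 = field(shifted(p, k2, h / 2))
    k4 = field(shifted(p, k3, h))
    return tuple(
        pi + h / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )

def trace_particle(field, seed, h, steps):
    """Return the list of positions visited by a particle released at seed."""
    path = [seed]
    for _ in range(steps):
        path.append(rk4_step(field, path[-1], h))
    return path

# Release one particle at (1, 0, 0) and follow it for 100 steps of size 0.01.
path = trace_particle(rotation_field, (1.0, 0.0, 0.0), 0.01, 100)
```

Tracing many seeds this way is what produces the dense visual output the abstract describes, since every seed contributes a full path.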

Jun, 26

Visual Simulation of Breaking Waves in Shallow Water

We introduce a composite method for simulation and rendering of breaking waves in computer graphics applications. The method generates not only a simple wave belt, as in previous work, but also a model that can be manipulated through the influence of the wind. The particle system we adopted combines the POINT primitive as the basic particle and fat particles as characteristic particles […]

Jun, 26

Kd-Jump: a Path-Preserving Stackless Traversal for Faster Isosurface Raytracing on GPUs

Stackless traversal techniques are often used to circumvent memory bottlenecks by avoiding a stack and replacing return traversal with extra computation. This paper addresses whether the stackless traversal approaches are useful on newer hardware and technology (such as CUDA). To this end, we present a novel stackless approach for implicit kd-trees, which exploits the benefits […]

CUDA

Jun, 26

A real-time coarse-to-fine multiview capture system for all-in-focus rendering on a light-field display

We present an end-to-end system capable of capturing and displaying, in real time and with full horizontal parallax, high-quality 3D video content on a cluster-driven multiprojector light-field display. The capture component is an array of low-cost USB cameras connected to a single PC. Raw M-JPEG data coming from the software-synchronized cameras are multicast over Gigabit Ethernet to the back-end nodes […]

CUDA • OpenGL

Jun, 26

Acceleration of the Method of Moments Calculations by Using Graphics Processing Units

The graphics processing unit (GPU) has been used to speed up the conventional method of moments (MoM) calculations for electromagnetic scattering from arbitrary three-dimensional conducting objects. The acceleration ratio for filling the impedance matrix has reached 30, while the total acceleration ratio (including iteration) is about 20. Moreover, a matrix splitting algorithm is developed to break […]

Jun, 26

CUDA-based acceleration and algorithm refinement for volume image registration

In this paper, we propose a GPU-based acceleration method to speed up volume image registration using Compute Unified Device Architecture (CUDA). A novel CUDA-based method for joint histogram computation is introduced in this paper, which is also valuable for 2D image registration and other general graphics applications. Additionally, an algorithm refinement is proposed to improve the […]

CUDA
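To make the term "joint histogram" in the abstract above concrete, here is a minimal serial sketch. It is a conceptual illustration only and not the paper's CUDA method; the function name, the flat-list volume layout, and the `max_value` default are assumptions. A parallel version would additionally have to handle concurrent increments of the same bin, for example with atomic operations.

```python
def joint_histogram(fixed, moving, bins, max_value=255):
    """Count how often intensity bin i in `fixed` co-occurs with bin j in `moving`.

    Both inputs are flat sequences of equal length holding intensities in
    [0, max_value]; the voxel at index k of one volume is assumed to be
    already aligned with index k of the other.
    """
    if len(fixed) != len(moving):
        raise ValueError("volumes must have the same number of voxels")
    hist = [[0] * bins for _ in range(bins)]
    scale = bins / (max_value + 1)  # maps an intensity to a bin index
    for a, b in zip(fixed, moving):
        hist[int(a * scale)][int(b * scale)] += 1
    return hist

# Four aligned voxels, two intensity bins per volume.
example = joint_histogram([0, 10, 200, 255], [0, 130, 200, 255], bins=2)
# example == [[1, 1], [0, 2]]
```

Registration measures such as mutual information are typically computed from a table like this, which is why its construction is a common acceleration target.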

Jun, 25

Parallel Processing for Normal Mixture Models of Hyperspectral Data Using a Graphics Processor

Multivariate normal mixture models, where a complex statistical distribution is represented by a weighted sum of several multivariate normal probability distributions, have many potential applications including anomaly detection (AD) in hyperspectral (HS) images. The high computational cost of mixture models requires hardware and/or algorithmic acceleration to make AD run in real time. In this paper […]

CUDA
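The "weighted sum of several multivariate normal probability distributions" in the abstract above can be sketched as follows. To stay dependency-free, the sketch assumes diagonal covariance matrices, which is a simplification and not necessarily what the paper uses; the function names and the density threshold are likewise hypothetical.

```python
import math

def diag_normal_pdf(x, mean, var):
    """Density of a multivariate normal with diagonal covariance at point x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of component densities; weights are assumed to sum to 1."""
    return sum(
        w * diag_normal_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )

def is_anomaly(x, weights, means, variances, threshold):
    """Flag a pixel spectrum whose mixture density falls below threshold."""
    return mixture_pdf(x, weights, means, variances) < threshold

# Two-component model over two-band "spectra" (illustrative values only).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (4.0, 4.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
print(is_anomaly((20.0, 20.0), weights, means, variances, threshold=1e-6))
```

Because every pixel is evaluated against every component independently, the per-pixel work maps naturally onto parallel hardware, which is the motivation the abstract gives for acceleration.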

Jun, 25

Ringing: Frugal Subdivision of Curves and Surfaces

Ringing reduces the working memory needed during the rendering of subdivision curves, surfaces, and animations. For example, it only stores 4d vertices when rendering the dth level of subdivision of a polygon using four-point, cubic, or quintic B-spline subdivision masks. Ringing can be used on CPUs and GPUs.

Jun, 25

Fast High-Quality Volume Ray Casting with Virtual Samplings

Volume ray-casting with a higher order reconstruction filter and/or a higher sampling rate has been adopted in direct volume rendering frameworks to provide a smooth reconstruction of the volume scalar and/or to reduce artifacts when the combined frequency of the volume and transfer function is high. While it enables high-quality volume rendering, it cannot support […]

Jun, 25

Hierarchical Line Integration

This paper presents an acceleration scheme for the numerical computation of sets of trajectories in vector fields or iterated solutions in maps, possibly with simultaneous evaluation of quantities along the curves such as integrals or extrema. It addresses cases with a dense evaluation on the domain, where straightforward approaches are subject to redundant calculations. These […]

CUDA

Jun, 25

Simulation of shallow water based on shader

We introduce a method for simulating and rendering ocean waves using vertex and fragment processing. The method is divided into four stages: wave generation, surface tessellation, optical simulation, and water surface rendering. Our implementation produces not only a simple wave influenced by terrain, as in previous work, but also a model that can be manipulated by the influence […]

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?
