
Posts

Jan, 17

Application of GPU Smooth Particle Hydrodynamics: Wave Runup and Overtopping on Composite Slopes

Smooth Particle Hydrodynamics is a Lagrangian meshless numerical method with substantially improved capabilities in simulation of both fluid dynamics and solid mechanics due to its meshless nature. GPUSPH is an implementation of Smoothed Particle Hydrodynamics (SPH) on Nvidia CUDA-enabled graphics cards. In this paper, GPUSPH is applied to runup and overtopping applications and compared […]
Jan, 17

Bouncing Behavior of Microscopic Dust Aggregates

Context: Bouncing collisions of dust aggregates within the protoplanetary disk may have a significant impact on the growth process of planetesimals. Yet, the conditions that result in bouncing are not very well understood. Existing simulations studying the bouncing behavior used aggregates with an artificial, very regular internal structure. Aims: Here, we study the bouncing behavior of […]
Jan, 16

Burrows-Wheeler Aligner: A Parallel Approach

The advent of mainframe computing brought about a fundamentally different way of approaching problems for many branches of science. But none has transformed quite like the science of biology. With genome sequencing now commonplace, an organism may be completely represented as a sequence of numbers. Harnessing the power of computers, sequences of genomes (and thus […]
Jan, 16

A parallel implementation of a derivative pricing model incorporating SABR calibration and probability lookup tables

We describe a high-performance parallel implementation of a derivative pricing model, within which we introduce a new parallel method for the calibration of the industry-standard SABR (stochastic alpha, beta, rho) stochastic volatility model using three strike inputs. SABR calibration involves a non-linear three-dimensional minimisation, and parallelisation is achieved by incorporating several assumptions unique […]
Jan, 16

Efficient implementation of multiuser precoding algorithms on GPU for MIMO-OFDM systems

In this paper, we focus on the signal precoding stage in multiuser multicarrier systems, which can often be a computationally expensive task. In order to reduce its computational time, the implementation of some of the most widely employed multiuser precoding algorithms on a general-purpose Graphic Processing Unit (GPU) is presented. These devices allow for a […]
Jan, 16

On the Use of Graphic Processing Units for the Efficient Implementation of MIMO Detectors

The use of Graphic Processing Units (GPU) for the efficient implementation of signal processing algorithms for MIMO communication systems has recently been receiving increasing attention. This is mainly due to their high capability for parallel processing together with their reasonable cost. In this work, the interest of GPU for the rapid prototyping of MIMO receivers is […]
Jan, 16

Analytic Visibility on the GPU

This paper presents a parallel, implementation-friendly analytic visibility method for triangular meshes. Together with an analytic filter convolution, it allows for a fully analytic solution to anti-aliased 3D mesh rendering on parallel hardware. Building on recent works in computational geometry, we present a new edge-triangle intersection algorithm and a novel method to complete the boundaries […]
Jan, 15

Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels

We study semantics of GPU kernels – the parallel programs that run on Graphics Processing Units (GPUs). We provide a novel lock-step execution semantics for GPU kernels represented by arbitrary reducible control flow graphs and compare this semantics with a traditional interleaving semantics. We show for terminating kernels that either both semantics compute identical results […]
Jan, 15

Three Dimensional Fast Fourier Transform CUDA Implementation

A 3-dimensional DFT can be expressed as three sets of 1-dimensional DFTs applied to 3-dimensional data, one set along each dimension. Each of these 1-dimensional DFTs can be computed efficiently owing to the properties of the transform. This class of algorithms is known as the Fast Fourier Transform (FFT). We introduce the one dimensional FFT algorithm in […]
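
The separability described in this abstract can be illustrated with a minimal sketch. It is not the authors' CUDA implementation: it uses a naive O(n²) 1-dimensional DFT in pure Python rather than an FFT, and the function names (`dft_1d`, `dft_3d_direct`, `dft_3d_separable`, `max_abs_diff`) are hypothetical, chosen for this illustration only. It compares the triple-sum definition of the 3-dimensional DFT with three passes of 1-dimensional DFTs, one pass per axis.

```python
import cmath


def dft_1d(values):
    """Naive O(n^2) 1-D DFT (stand-in for an FFT; illustration only)."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def dft_3d_direct(data):
    """3-D DFT evaluated straight from the triple-sum definition."""
    nx, ny, nz = len(data), len(data[0]), len(data[0][0])
    out = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    for u in range(nx):
        for v in range(ny):
            for w in range(nz):
                total = 0j
                for x in range(nx):
                    for y in range(ny):
                        for z in range(nz):
                            phase = u * x / nx + v * y / ny + w * z / nz
                            total += data[x][y][z] * cmath.exp(-2j * cmath.pi * phase)
                out[u][v][w] = total
    return out


def dft_3d_separable(data):
    """3-D DFT computed as 1-D DFTs along z, then y, then x."""
    nx, ny, nz = len(data), len(data[0]), len(data[0][0])
    # Pass 1: transform every line along the z axis.
    a = [[dft_1d([complex(v) for v in data[x][y]]) for y in range(ny)] for x in range(nx)]
    # Pass 2: transform every line along the y axis.
    for x in range(nx):
        for z in range(nz):
            line = dft_1d([a[x][y][z] for y in range(ny)])
            for y in range(ny):
                a[x][y][z] = line[y]
    # Pass 3: transform every line along the x axis.
    for y in range(ny):
        for z in range(nz):
            line = dft_1d([a[x][y][z] for x in range(nx)])
            for x in range(nx):
                a[x][y][z] = line[x]
    return a


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3-D arrays."""
    return max(
        abs(a[x][y][z] - b[x][y][z])
        for x in range(len(a))
        for y in range(len(a[0]))
        for z in range(len(a[0][0]))
    )


if __name__ == "__main__":
    # Small 2 x 3 x 4 test volume with arbitrary values.
    sample = [[[float(x * 12 + y * 4 + z) for z in range(4)] for y in range(3)] for x in range(2)]
    diff = max_abs_diff(dft_3d_direct(sample), dft_3d_separable(sample))
    print("max difference between direct and separable 3-D DFT:", diff)
```

Both routines should agree up to floating-point rounding. Replacing `dft_1d` with a true FFT would leave the three-pass structure unchanged, which is what makes the per-axis decomposition attractive on parallel hardware.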
Jan, 15

Two Approaches to Particle Simulation: OpenMPI and CUDA

In this project, our goal is to make particle simulation run in parallel using both CUDA and MPI implementations. The scales of parallelism of the two are vastly different, and we wish to compare the implementations to gain insight into the strengths and weaknesses of the two paradigms of parallelization. Particle simulations are of huge […]
Jan, 15

Ray Tracing in Real-Time Games

This thesis describes efficient rendering algorithms based on ray tracing, and the application of these algorithms to real-time games. Compared to rasterization-based approaches, rendering based on ray tracing allows elegant and correct simulation of important global effects, such as shadows, reflections and refractions. The price for these benefits is performance: ray tracing is compute-intensive. This […]
Jan, 15

Massively parallelizable list-mode reconstruction using a Monte Carlo-based elliptical Gaussian model

PURPOSE: A fully three-dimensional (3D) massively parallelizable list-mode ordered-subsets expectation-maximization (LM-OSEM) reconstruction algorithm has been developed for high-resolution PET cameras. System response probabilities are calculated online from a set of parameters derived from Monte Carlo simulations. The shape of a system response for a given line of response (LOR) has been shown to be asymmetrical […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org