Posts
Jun, 20
Reconfigurable Control Variate Monte-Carlo Designs for Pricing Exotic Options
Exotic options are financial derivatives which have complex features including path-dependency. These complex features make them difficult to price, as only computationally intensive Monte-Carlo methods can provide accurate prices. This paper proposes an FPGA-accelerated control variate Monte-Carlo (CVMC) framework for pricing exotic options. An optimised implementation of arithmetic Asian option pricing under this framework in […]
Jun, 20
Lattice-based flow field modeling
We present an approach for simulating the natural dynamics that emerge from the interaction between a flow field and immersed objects. We model the flow field using the lattice Boltzmann model (LBM) with boundary conditions appropriate for moving objects and accelerate the computation on commodity graphics hardware (GPU) to achieve real-time performance. The boundary conditions […]
Jun, 20
Implementing Sparse Matrix-Vector multiplication using CUDA based on a hybrid sparse matrix format
The Sparse Matrix-Vector product (SpMV) is a key operation in engineering and scientific computing. Methods for efficiently implementing it in parallel are critical to the performance of many applications. Modern Graphics Processing Units (GPUs), coupled with the advent of general-purpose programming environments like NVIDIA’s CUDA, have gained interest as a viable architecture for data-parallel […]
Jun, 20
Second Order Pre-Integrated Volume Rendering
In the field of Volume Rendering, the pre-integration of arbitrary transfer functions has led to the most significant and convincing results in terms of both quality and performance, allowing high-quality visualization on standard consumer PC graphics hardware. By showing that the ideal scalar signal along the cast rays is better approximated by a succession of polynomial […]
Jun, 20
GPUs for fast triggering and pattern matching at the CERN experiment NA62
In high-energy physics experiments, the trigger system is crucial for reducing the quantity of data recorded on tape and the acquisition bandwidth requirements. This is particularly true in rare-decay experiments. The NA62 experiment aims at measuring the branching ratio of $K^+ \to \pi^+ \nu \bar{\nu}$, predicted in the Standard Model (SM) at the level of $\sim 10^{-10}$. In […]
Jun, 20
Large-Scale Stereo Display Wall Using Programmable Graphics Hardware
In this paper, we present a large-scale stereo display wall system for tangible telemeeting using programmable graphics hardware. For tangible telemeeting, it is important to provide an immersive display with high-resolution images that cover the field of view and give the local user the same environment as that of the remote site. To achieve […]
Jun, 20
Efficient Surface Reconstruction From Noisy Data Using Regularized Membrane Potentials
A physically motivated method for surface reconstruction is proposed that can recover smooth surfaces from noisy and sparse data sets. No orientation information is required. A new technique based on regularized-membrane potentials aggregates the input sample points, leading to improved noise tolerance and outlier removal without sacrificing much with respect to detail (feature) […]
Jun, 20
Solving 2D Nonlinear Unsteady Convection-Diffusion Equations on Heterogenous Platforms with Multiple GPUs
Solving complex convection-diffusion equations is very important to many practical mathematical and physical problems. After finite difference discretization, most of the solution time is spent in sparse linear equation solvers. In this paper, our goal is to solve 2D Nonlinear Unsteady Convection-Diffusion Equations by accelerating an iterative algorithm named Jacobi-preconditioned QMRCGSTAB on […]
Jun, 20
Accelerating batched 1D-FFT with a CUDA-capable computer
This work concerns the application of CUDA-based software (Compute Unified Device Architecture), developed by NVIDIA for programmable Graphics Processing Units (GPUs). CUDA code is written in ‘C for CUDA’, indicating the standard C programming language with NVIDIA extensions. Our goal was to find out whether batched (multiple) one-dimensional Fast Fourier Transformation (1DFFT), often encountered in various […]
Jun, 20
Distance field transform with an adaptive iteration method
We propose a novel distance field transform method based on an iterative method performed adaptively on an evolving active band. Our method uses a narrow band to store the active grid points being computed. Unlike the conventional fast marching method, we do not maintain a priority queue and instead perform iterative computation inside the band. This […]
Jun, 20
Image-Based Material Restyling with Fast Non-local Means Filtering
This paper presents a new GPU-based implementation of fast non-local means (NLM) filtering for material restyling. Our fast NLM filtering algorithm is able to achieve real-time feedback for interactive image editing. Furthermore, a novel material editing method based on our fast NLM filtering is proposed to change the material appearance of image-based objects. Given an […]
Jun, 20
A Parallel Streaming Motion Estimation for Real-Time HD H.264 Encoding on Programmable Processors
Motion estimation is an important compute-intensive component in most video compression standards. The high computational costs and heavy memory bandwidth requirements of motion estimation put heavy pressure on most existing programmable processors, especially in real-time high-definition H.264 video encoding. The emerging stream processing model supported by most programmable processors provides a powerful mechanism to […]

