
Posts

Nov, 23

Graphical Asian Options

We discuss the problem of pricing Asian options in the Black-Scholes model using CUDA on a graphics processing unit. We survey some of the issues with GPU programming and discuss code design and memory usage. We show that by using a quasi-Monte Carlo simulation with a geometric Asian option as a control variate, it is […]
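The control-variate idea mentioned in this abstract can be sketched on the CPU as follows. This is a minimal illustration, not the paper's method: it uses plain pseudo-random sampling instead of quasi-Monte Carlo, runs in Python rather than CUDA, assumes discrete monitoring at equally spaced dates, and fixes the control-variate coefficient at 1. The function and parameter names (`geometric_asian_call`, `asian_call_mc`, `s0`, `k`, `paths`) are hypothetical.

```python
import math
import random
import statistics


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def geometric_asian_call(s0, k, r, sigma, t, n):
    """Closed-form price of a discretely monitored geometric Asian call."""
    dt = t / n
    # Mean and variance of log(geometric average) over dates i * dt, i = 1..n.
    m = math.log(s0) + (r - 0.5 * sigma ** 2) * dt * (n + 1) / 2.0
    v = sigma ** 2 * dt * (n + 1) * (2 * n + 1) / (6.0 * n)
    d2 = (m - math.log(k)) / math.sqrt(v)
    d1 = d2 + math.sqrt(v)
    return math.exp(-r * t) * (math.exp(m + 0.5 * v) * norm_cdf(d1) - k * norm_cdf(d2))


def asian_call_mc(s0, k, r, sigma, t, n, paths, seed=0):
    """Estimate an arithmetic Asian call with and without the control variate."""
    rng = random.Random(seed)
    dt = t / n
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    geo_exact = geometric_asian_call(s0, k, r, sigma, t, n)

    plain, controlled = [], []
    for _path in range(paths):
        log_s = math.log(s0)
        sum_s = 0.0    # running sum of prices (arithmetic average)
        sum_log = 0.0  # running sum of log-prices (geometric average)
        for _step in range(n):
            log_s += drift + vol * rng.gauss(0.0, 1.0)
            sum_s += math.exp(log_s)
            sum_log += log_s
        arith = disc * max(sum_s / n - k, 0.0)
        geo = disc * max(math.exp(sum_log / n) - k, 0.0)
        plain.append(arith)
        # Control variate with coefficient 1 (an assumption, not tuned).
        controlled.append(arith - geo + geo_exact)

    root = math.sqrt(paths)
    return {
        "plain_mean": statistics.mean(plain),
        "plain_se": statistics.stdev(plain) / root,
        "cv_mean": statistics.mean(controlled),
        "cv_se": statistics.stdev(controlled) / root,
    }


if __name__ == "__main__":
    result = asian_call_mc(100.0, 100.0, 0.05, 0.2, 1.0, 12, 2000, seed=1)
    print(result)
```

Because the arithmetic and geometric payoffs are computed on the same path, they are strongly correlated, so the standard error of the controlled estimate should be much smaller than that of the plain estimate for the same number of paths.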
Nov, 23

Challenges and opportunities of obtaining performance from multi-core CPUs and many-core GPUs

Multi-core processors represent a major development in computing technology. For example, Intel Core™ 2 Quad processors, IBM Cell processors, and Nvidia GeForce 9800 GX2 are widely used. However, most applications struggle to make the best use of the power provided by many-core processors. Easy-to-use software tools are hard to find. Furthermore, it’s not clear what […]
Nov, 23

Implementation of a Lattice Boltzmann kernel using the Compute Unified Device Architecture developed by nVIDIA

In this article a very efficient implementation of a 2D-Lattice Boltzmann kernel using the Compute Unified Device Architecture (CUDA) interface developed by nVIDIA is presented. By exploiting the explicit parallelism exposed in the graphics hardware we obtain a performance gain of more than one order of magnitude compared to standard CPUs. A non-trivial example, the flow through a […]
Nov, 23

Efficient Probabilistic Model Checking on General Purpose Graphics Processors

We present algorithms for parallel probabilistic model checking on general purpose graphics processing units (GPGPUs). For this purpose we exploit the fact that some of the basic algorithms for probabilistic model checking rely on matrix-vector multiplication. Since linear algebraic operations of this kind are implemented very efficiently on GPGPUs, the new parallel algorithms can […]
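To illustrate why matrix-vector multiplication is central here, the sketch below computes a bounded reachability probability in a small discrete-time Markov chain by repeated multiplication. It is a sequential Python illustration under assumed inputs, not the paper's GPU algorithm; the names `mat_vec` and `bounded_reachability` and the three-state example chain are hypothetical.

```python
def mat_vec(matrix, vec):
    """Dense matrix-vector product; this is the step a GPU would parallelize."""
    return [sum(p * x for p, x in zip(row, vec)) for row in matrix]


def bounded_reachability(matrix, targets, steps):
    """Probability, per start state, of reaching a target within `steps` steps."""
    n = len(matrix)
    # Step 0: only target states have already reached the target.
    x = [1.0 if s in targets else 0.0 for s in range(n)]
    for _ in range(steps):
        y = mat_vec(matrix, x)
        # Target states stay at probability 1; others take the propagated value.
        x = [1.0 if s in targets else y[s] for s in range(n)]
    return x


if __name__ == "__main__":
    # Hypothetical chain: state 0 loops or moves on, state 1 is the goal,
    # state 2 is a failure state; states 1 and 2 are absorbing.
    chain = [
        [0.5, 0.25, 0.25],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    print(bounded_reachability(chain, {1}, 3))
```

Each row of the product is independent of the others, which is what makes this iteration a natural fit for data-parallel hardware.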
Nov, 23

Parallel computation of mutual information on the GPU with application to real-time registration of 3D medical images

Due to processing constraints, automatic image-based registration of medical images has been largely used as a pre-operative tool. We propose a novel method named sort and count for efficient parallelization of mutual information (MI) computation designed for massively multi-processing architectures. Combined with a parallel transformation implementation and an improved optimization algorithm, our method achieves real-time […]
Nov, 23

Mapping High-Fidelity Volume Rendering for Medical Imaging to CPU, GPU and Many-Core Architectures

Medical volumetric imaging requires high fidelity, high performance rendering algorithms. We motivate and analyze new volumetric rendering algorithms that are suited to modern parallel processing architectures. First, we describe the three major categories of volume rendering algorithms and confirm through an imaging scientist-guided evaluation that ray-casting is the most acceptable. We describe a thread- and […]
Nov, 23

Fast perspective volume ray casting method using GPU-based acceleration techniques for translucency rendering in 3D endoluminal CT colonography

Recent advances in graphics processing units (GPUs) have enabled direct volume rendering at interactive rates. However, although perspective volume rendering of opaque isosurfaces is performed rapidly using conventional GPU-based methods, perspective volume rendering of non-opaque volumes, such as translucency rendering, is still slow. In this paper, we propose an efficient GPU-based acceleration technique of fast […]
Nov, 23

hiCUDA: a high-level directive-based language for GPU programming

The Compute Unified Device Architecture (CUDA) has become a de facto standard for programming NVIDIA GPUs. However, CUDA places on the programmer the burden of packaging GPU code in separate functions, of explicitly managing data transfer between the host memory and various components of the GPU memory, and of manually optimizing the utilization of the […]
Nov, 23

Acceleration of a QM/MM-QMC simulation using GPU

We accelerated an ab-initio molecular QMC calculation by using GPGPU. Only the bottleneck part of the calculation is replaced by a CUDA subroutine and performed on the GPU, yielding 23.5 (9.7) times faster performance in single (double) precision. The energy deviation caused by the single precision treatment was found to be within the accuracy required in the […]
Nov, 22

GPU-based cone beam computed tomography

The use of cone beam computed tomography (CBCT) is growing in the clinical arena due to its ability to provide 3D information during interventions, its high diagnostic quality (sub-millimeter resolution), and its short scanning times (60 s). In many situations, the short scanning time of CBCT is followed by a time-consuming 3D reconstruction. The standard […]
Nov, 22

Efficient and Accurate Sound Propagation Using Adaptive Rectangular Decomposition

Accurate sound rendering can add significant realism to complement visual display in interactive applications, as well as facilitate acoustic predictions for many engineering applications, like accurate acoustic analysis for architectural design. Numerical simulation can provide this realism most naturally by modeling the underlying physics of wave propagation. However, wave simulation has traditionally posed a tough […]
Nov, 22

3D nonrigid registration via optimal mass transport on the GPU

In this paper, we present a new computationally efficient numerical scheme for the minimizing flow approach for optimal mass transport (OMT) with applications to non-rigid 3D image registration. The approach utilizes all of the gray-scale data in both images, and the optimal mapping from image A to image B is the inverse of the optimal […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
