Posts
Jun, 21
Performance evaluation of the multi-device OpenCL FDTD solver
We present results of an evaluation of a multi-device OpenCL FDTD solver. Portability between hardware manufactured by different vendors and also between highly specialized and parallel computing architectures available on the market, i.e. GPUs, multi-core CPUs and devices integrating both technologies in a single-die IC, is the main advantage of this solver. For code execution […]
Jun, 21
Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth
Stencil computation is one of the important kernels in scientific computations; however, its sustained performance is limited by memory bandwidth, especially on multi-core microprocessors and GPGPUs, due to its small operational intensity. In this paper, we propose a scalable streaming-array (SSA) of simple soft-processors for high-performance stencil computation on multiple FPGAs. The SSA architecture allows a […]
Jun, 21
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs
BACKGROUND: Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the near future. To overcome this challenge, several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show a great potential of a […]
Jun, 21
Fast, parallel, GPU-based construction of space filling curves and octrees
Space Filling Curves (SFC) are particularly useful in linearization of data living in two- and three-dimensional spaces and have been used in a number of applications in scientific computing and visualization. Interestingly, octrees, another versatile data structure in computer graphics, can be viewed as multiple SFCs at varying resolutions, albeit with a parent-child relationship. In […]
Jun, 20
Neural network modeling on evolution of hydration reaction for Portland cement
The hydration reaction of Portland cement paste has an important impact on the formation of microstructure and development of strength. However, simulating the evolution of hydration reaction is very difficult because there are multi-phased, multi-sized and interrelated complex chemical and physical reactions during cement hydration. In this paper, a feedforward neural network model is built […]
Jun, 20
A Scalable End-to-End Optimized Real-Time Image-Based Rendering Framework on Graphics Hardware
This paper presents the system-level overview of a real-time image-based rendering framework performing multiple intermediate view synthesis, completely on the Graphics Processing Unit (GPU). The software design achieves high-performance, yet maintains flexibility and ease of development through a hierarchical layered architecture. The framework implements the intermediate view synthesis by a chain of consecutive processing […]
Jun, 20
Reconfigurable Control Variate Monte-Carlo Designs for Pricing Exotic Options
Exotic options are financial derivatives which have complex features including path-dependency. These complex features make them difficult to price, as only computationally intensive Monte-Carlo methods can provide accurate prices. This paper proposes an FPGA-accelerated control variate Monte-Carlo (CVMC) framework for pricing exotic options. An optimised implementation of arithmetic Asian option pricing under this framework in […]
Jun, 20
Lattice-based flow field modeling
We present an approach for simulating the natural dynamics that emerge from the interaction between a flow field and immersed objects. We model the flow field using the lattice Boltzmann model (LBM) with boundary conditions appropriate for moving objects and accelerate the computation on commodity graphics hardware (GPU) to achieve real-time performance. The boundary conditions […]
Jun, 20
Implementing Sparse Matrix-Vector multiplication using CUDA based on a hybrid sparse matrix format
The Sparse Matrix-Vector product (SpMV) is a key operation in engineering and scientific computing. Methods for efficiently implementing it in parallel are critical to the performance of many applications. Modern Graphics Processing Units (GPUs), coupled with the advent of general purpose programming environments like NVIDIA’s CUDA, have gained interest as a viable architecture for data-parallel […]
Jun, 20
Second Order Pre-Integrated Volume Rendering
In the field of Volume Rendering, the pre-integration of arbitrary transfer functions has certainly led to the most significant and convincing results in terms of both quality and performance, allowing high quality visualization on standard PC consumer graphics. By showing that the ideal scalar signal along the cast rays is better approximated by a succession of polynomial […]
Jun, 20
GPUs for fast triggering and pattern matching at the CERN experiment NA62
In high energy physics experiments, the trigger system is crucial to reduce the quantity of data recorded on tape and the acquisition bandwidth requirements. This is particularly true in rare decay experiments. The NA62 experiment aims at measuring the branching ratio of $K^+ \to \pi^+ \nu \bar{\nu}$, predicted in the standard model (SM) at the level of $\sim 10^{-10}$. In […]
Jun, 20
Large-Scale Stereo Display Wall Using Programmable Graphics Hardware
In this paper, we present a large-scale stereo display wall system for tangible telemeeting using programmable graphics hardware. For tangible telemeeting, it is important to provide an immersive display with high-resolution images that cover the field of view and give the local user the same environment as that of the remote site. To achieve […]