Posts
Dec, 2
Effective GPU Strategies for LU Decomposition
GPUs are becoming an attractive computing platform not only for traditional graphics computation but also for general-purpose computation because of the computational power, programmability and comparatively low cost of modern GPUs. This has led to a variety of complex GPGPU applications with significant performance improvements. The LU decomposition represents a fundamental step in many computationally […]
Dec, 2
Parallel-META: A high-performance computational pipeline for metagenomic data analysis
Metagenomic methods directly sequence and analyze genome information from microbial communities. A single community usually contains hundreds of genomes or more from different microbial species, and the main computational tasks for metagenomic data analysis include identifying the taxonomical and functional components of these genomes in the microbial community. Metagenomic data analysis is both data- and […]
Dec, 2
Efficient Cubic B-spline Image Interpolation on a GPU
Applying a geometric transformation to an image requires an interpolation step. When applied to image rotation, the most efficient current GPU implementation of cubic spline image interpolation still costs about 10 times as much as linear interpolation. This implementation involves two steps: a prefilter step performs a two-pass forward-backward recursive filter, then a cubic polynomial […]
Dec, 2
Massively Parallelized Monte Carlo Simulation and its Applications in Finance
In this paper, we propose, develop and implement a tool that increases the computational speed of exotic derivatives pricing at a fraction of the cost of traditional methods. Our paper focuses on investigating the computing efficiencies of GPU systems. We utilize the GPU’s natural parallelization capabilities to price financial instruments. We outline our implementation, solutions […]
Dec, 2
An error correction solver for linear systems: Evaluation of mixed precision implementations
This paper proposes an error correction method for solving linear systems of equations and evaluates an implementation using mixed precision techniques. While different technologies are available, graphics processing units (GPUs) have been established as particularly powerful coprocessors in recent years. For this reason, our error correction approach is focused on a CUDA implementation […]
Dec, 2
Auto-optimization of a Feature Selection Algorithm
Advanced visualization algorithms are typically computationally expensive but highly data parallel, which makes them attractive candidates for GPU architectures. However, porting algorithms to a GPU remains a challenging process. The Mint programming model addresses this issue with its simple, high-level interface. It targets the users who seek real-time performance without investing in […]
Dec, 2
Evaluation of Fermi Features for Data Mining Algorithms
A recent development in high performance computing is the availability of NVIDIA’s Fermi, or 20-series, GPUs. These offer features such as inbuilt atomic double precision support and increased shared memory. This thesis focuses on optimizing and evaluating the new features offered by the Fermi series GPUs for data mining algorithms involving reductions. Using three […]
Dec, 2
Implementation of the FDTD Method Based on Lorentz-Drude Dispersive Model on GPU for Plasmonics Applications
We present a three-dimensional finite difference time domain (FDTD) method on a graphics processing unit (GPU) for plasmonics applications. For the simulation of plasmonics devices, the Lorentz-Drude (LD) dispersive model is incorporated into Maxwell’s equations, while the auxiliary differential equation (ADE) technique is applied to the LD model. Our numerical experiments based on typical domain sizes […]
Dec, 2
Spotting Radio Transients with the help of GPUs
Exploration of the time-domain radio sky has huge potential for advancing our knowledge of the dynamic universe. Past surveys have discovered large numbers of pulsars, rotating radio transients and other transient radio phenomena; however, they have typically relied upon off-line processing to cope with the high data and processing rates. This paradigm rules out the […]
Dec, 1
A programming language interface to describe transformations and code generation
This paper presents a programming language interface, a complete scripting language, to describe composable compiler transformations. These transformation programs can be written, shared and reused by non-expert application and library developers. From a compiler writer’s perspective, a scripting language interface permits rapid prototyping of compiler algorithms that can mix levels and compose different sequences of […]
Dec, 1
GPU Acceleration of Solving Parabolic Partial Differential Equations Using Difference Equations
Parabolic partial differential equations are often used to model systems involving heat transfer, acoustics, and electrostatics. The need for more complex models with increasing precision drives greater computational demands from processors. Since solving these types of equations is inherently parallel, GPU computing offers an attractive solution for drastically decreasing time to completion, power usage, and […]
Dec, 1
Scalable Data Clustering using GPU Clusters
The computational demands of multivariate clustering grow rapidly, and therefore processing large data sets, like those found in flow cytometry data, is very time consuming on a single CPU. Fortunately, these techniques lend themselves naturally to large-scale parallel processing. To address the computational demands, graphics processing units, specifically NVIDIA’s CUDA framework and Tesla architecture, […]