Posts
Dec, 3
EM+TV for Reconstruction of Cone-beam CT with Curved Detectors using GPU
Computerized tomography (CT) plays a critical role in the practice of modern medicine. However, the radiation associated with CT is significant. Methods that can enable CT imaging at reduced radiation exposure without sacrificing image quality are therefore extremely important. This paper introduces a novel method for enabling improved reconstruction at lower radiation exposure levels. The […]
Dec, 2
On the design of architecture-aware algorithms for emerging applications
This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from […]
Dec, 2
Effective GPU Strategies for LU Decomposition
GPUs are becoming an attractive computing platform not only for traditional graphics computation but also for general-purpose computation because of the computational power, programmability and comparatively low cost of modern GPUs. This has led to a variety of complex GPGPU applications with significant performance improvements. The LU decomposition represents a fundamental step in many computationally […]
Dec, 2
Parallel-META: A high-performance computational pipeline for metagenomic data analysis
Metagenomic methods directly sequence and analyze genome information from microbial communities. A single community usually contains hundreds of genomes or more from different microbial species, and the main computational tasks in metagenomic data analysis include identifying the taxonomical and functional components of these genomes. Metagenomic data analysis is both data- and […]
Dec, 2
Efficient Cubic B-spline Image Interpolation on a GPU
Applying a geometric transformation to an image requires an interpolation step. When applied to image rotation, the most efficient current GPU implementation of cubic spline image interpolation still costs about 10 times as much as linear interpolation. This implementation involves two steps: a prefilter step performs a two-pass forward-backward recursive filter, then a cubic polynomial […]
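The prefilter step mentioned in this abstract can be sketched in one dimension on the CPU. This is a minimal NumPy illustration of the standard two-pass recursive filter for cubic B-splines, not the GPU implementation the paper evaluates. The mirror boundary handling and the names `POLE` and `bspline_prefilter_1d` are assumptions made for the example.

```python
import numpy as np

# Pole of the cubic B-spline prefilter (a standard constant, not from the excerpt).
POLE = np.sqrt(3.0) - 2.0


def bspline_prefilter_1d(samples):
    """Return cubic B-spline coefficients for a 1D signal (mirror boundaries assumed)."""
    c = np.asarray(samples, dtype=np.float64) * 6.0  # filter gain (1 - z)(1 - 1/z) = 6
    n = c.size
    if n < 2:
        return c / 6.0  # a single sample is its own coefficient
    z = POLE
    # Causal initial value: exact sum over the mirrored, periodic extension.
    k = np.arange(1, n - 1)
    total = c[0] + z ** (n - 1) * c[-1]
    total += np.sum((z ** k + z ** (2 * n - 2 - k)) * c[1:-1])
    c[0] = total / (1.0 - z ** (2 * n - 2))
    # Forward (causal) pass.
    for i in range(1, n):
        c[i] += z * c[i - 1]
    # Anticausal initial value, then backward pass.
    c[-1] = (z / (z * z - 1.0)) * (z * c[-2] + c[-1])
    for i in range(n - 2, -1, -1):
        c[i] = z * (c[i + 1] - c[i])
    return c
```

For a 2D image, the same routine would be applied along every row and then along every column. Each sample depends on its predecessor within a pass, which is the sequential dependency that makes this step comparatively expensive on a GPU.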
Dec, 2
Massively Parallelized Monte Carlo Simulation and its Applications in Finance
In this paper, we propose, develop and implement a tool that increases the computational speed of exotic derivatives pricing at a fraction of the cost of traditional methods. Our paper focuses on investigating the computing efficiencies of GPU systems. We utilize the GPU’s natural parallelization capabilities to price financial instruments. We outline our implementation, solutions […]
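Monte Carlo pricing parallelizes naturally because every simulated path is independent of the others. The sketch below illustrates this with a vectorized NumPy estimate for an arithmetic-average Asian call under geometric Brownian motion. The choice of payoff and model, and the function name `asian_call_mc` with all its parameters, are assumptions for illustration; the excerpt does not say which instruments or models the paper uses.

```python
import numpy as np


def asian_call_mc(s0, strike, rate, sigma, maturity, steps, paths, seed=0):
    """Monte Carlo price of an arithmetic-average Asian call (illustrative only)."""
    rng = np.random.default_rng(seed)
    dt = maturity / steps
    # One row per path, one column per time step; rows are independent.
    z = rng.standard_normal((paths, steps))
    log_steps = (rate - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    prices = s0 * np.exp(np.cumsum(log_steps, axis=1))
    # Payoff depends on the average price along each path.
    payoff = np.maximum(prices.mean(axis=1) - strike, 0.0)
    # Discount the average payoff back to today.
    return np.exp(-rate * maturity) * payoff.mean()
```

On a GPU, each path (each row of the array above) would typically map to its own thread, with a final reduction to average the payoffs.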
Dec, 2
An error correction solver for linear systems: Evaluation of mixed precision implementations
This paper proposes an error correction method for solving linear systems of equations and evaluates an implementation that uses mixed precision techniques. While different technologies are available, graphics processing units (GPUs) have been established as particularly powerful coprocessors in recent years. For this reason, our error correction approach is focused on a CUDA implementation […]
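One common form of mixed precision error correction is iterative refinement: solve cheaply in single precision, compute the residual in double precision, and solve again for a correction. The excerpt does not state that the paper uses exactly this scheme, so the NumPy sketch below is only an illustration of the general idea. The function name `refine_solve` and the fixed iteration count are assumptions.

```python
import numpy as np


def refine_solve(a, b, iterations=5):
    """Solve a @ x = b with float32 solves and float64 residual correction."""
    a64 = np.asarray(a, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    a32 = a64.astype(np.float32)
    # Initial low-precision solve.
    x = np.linalg.solve(a32, b64.astype(np.float32)).astype(np.float64)
    for _ in range(iterations):
        # Residual in high precision exposes the error of the current solution.
        r = b64 - a64 @ x
        # Solve for the correction in low precision, then apply it.
        d = np.linalg.solve(a32, r.astype(np.float32))
        x += d.astype(np.float64)
    return x
```

The appeal on a GPU is that the expensive solves run in the faster single precision while only the residual needs double precision. The scheme converges only when the matrix is reasonably well conditioned.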
Dec, 2
Auto-optimization of a Feature Selection Algorithm
Advanced visualization algorithms are typically computationally expensive but highly data parallel, which makes them attractive candidates for GPU architectures. However, porting algorithms to a GPU remains a challenging process. The Mint programming model addresses this issue with its simple and high-level interface. It targets users who seek real-time performance without investing in […]
Dec, 2
Evaluation of Fermi Features for Data Mining Algorithms
A recent development in High Performance Computing is the availability of NVIDIA’s Fermi, or 20-series, GPUs. These offer features such as built-in atomic double precision support and increased shared memory. This thesis focuses on optimizing and evaluating the new features offered by the Fermi series GPUs for data mining algorithms involving reductions. Using three […]
Dec, 2
Implementation of the FDTD Method Based on Lorentz-Drude Dispersive Model on GPU for Plasmonics Applications
We present a three-dimensional finite difference time domain (FDTD) method on a graphics processing unit (GPU) for plasmonics applications. For the simulation of plasmonics devices, the Lorentz-Drude (LD) dispersive model is incorporated into Maxwell's equations, while the auxiliary differential equation (ADE) technique is applied to the LD model. Our numerical experiments based on typical domain sizes […]
Dec, 2
Spotting Radio Transients with the help of GPUs
Exploration of the time-domain radio sky has huge potential for advancing our knowledge of the dynamic universe. Past surveys have discovered large numbers of pulsars, rotating radio transients and other transient radio phenomena; however, they have typically relied upon off-line processing to cope with the high data and processing rates. This paradigm rules out the […]
Dec, 1
A programming language interface to describe transformations and code generation
This paper presents a programming language interface, a complete scripting language, to describe composable compiler transformations. These transformation programs can be written, shared and reused by non-expert application and library developers. From a compiler writer’s perspective, a scripting language interface permits rapid prototyping of compiler algorithms that can mix levels and compose different sequences of […]