For many finite element problems, when represented as sparse matrices, iterative solvers are found to be unreliable because they can impose computational bottlenecks. Early pioneering work by Duff et al. explored an alternative strategy called multifrontal sparse matrix factorization. This approach, by representing the sparse problem as a tree of dense systems, maps well to […]

The performance potential for simulating quantum electron transport on graphical processing units (GPUs) is studied. Using graphene ribbons of realistic sizes as an example, it is shown that GPUs provide significant speed-ups in comparison to central processing units as the transverse dimension of the ribbon grows. The recursive Green’s function algorithm is employed and implementation […]

Space Filling Curves (SFCs) are particularly useful for linearizing data living in two- and three-dimensional spaces and have been used in a number of applications in scientific computing and visualization. Interestingly, octrees, another versatile data structure in computer graphics, can be viewed as multiple SFCs at varying resolutions, albeit with a parent-child relationship. In […]
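As an illustration of how a space filling curve linearizes 2D data, here is a minimal sketch using the Morton (Z-order) curve. The choice of curve is an assumption made for illustration; the abstract does not say which SFC the paper uses, and the function name `morton2d` is hypothetical. The sketch also shows the parent-child relationship mentioned above: dropping the two lowest bits of a cell's code gives the code of its parent cell one level coarser, as in a quadtree (the 2D analogue of an octree).

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)       # x bits go to even positions
        code |= ((y >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return code


# Linearize a 4x4 grid: sort the cells by their position along the curve.
cells = [(x, y) for y in range(4) for x in range(4)]
order = sorted(cells, key=lambda c: morton2d(*c))

# Parent-child relation: shifting the code right by 2 bits yields the code
# of the enclosing cell at the next coarser resolution.
child = (2, 3)
parent_code = morton2d(*child) >> 2
```

Cells that are close along the curve tend to be close in space, which is what makes such orderings attractive for memory layout.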

In this paper we investigate the use of GPUs as digital beamformers. We specify a parallel implementation of a beamformer in the time and frequency domains and measure its performance. We also give examples of the processing limits of the NVIDIA GeForce 8800 GPU with respect to application parameters: number of sensors, sampling frequency, bandwidth, and number […]

Motivation: The importance of stochasticity in biological systems is becoming increasingly recognised and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool STOCHSIMGPU which exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems and show that significant gains in efficiency […]

High performance computing is critical for financial markets where analysts seek to accelerate complex optimizations such as pricing engines to maintain a competitive edge. In this paper we investigate the performance of financial workloads on the Sony-Toshiba-IBM Cell Broadband Engine, a heterogeneous multicore chip architected for intensive gaming applications and high performance computing. We […]

A general methodology for noise reduction and contrast enhancement in very noisy image data with low dynamic range is presented. Video footage recorded in very dim light is especially targeted. Smoothing kernels that automatically adapt to the local spatio-temporal intensity structure in the image sequences are constructed in order to preserve and enhance fine spatial […]

Gather and scatter are two fundamental data-parallel operations, where a large number of data items are read (gathered) from or are written (scattered) to given locations. In this paper, we study these two operations on graphics processing units (GPUs).
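The two operations defined above can be sketched sequentially as follows. This is only a reference illustration of their semantics, not the GPU implementation studied in the paper; the function names and the `fill` parameter are hypothetical.

```python
def gather(src, idx):
    """Gather: out[i] = src[idx[i]] (indexed reads, sequential writes)."""
    return [src[i] for i in idx]


def scatter(src, idx, size, fill=None):
    """Scatter: out[idx[i]] = src[i] (sequential reads, indexed writes)."""
    out = [fill] * size
    for value, i in zip(src, idx):
        out[i] = value
    return out


data = [10, 20, 30, 40]
perm = [2, 0, 3, 1]

gathered = gather(data, perm)                   # reads from given locations
restored = scatter(gathered, perm, len(data))   # writes to given locations
```

When `idx` is a permutation, scattering with the same indices undoes the gather. On a parallel device each loop iteration would be handled by its own thread, and the irregular memory accesses implied by `idx` are what make these operations interesting to optimize.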

Position Weight Matrices (PWMs) are broadly used in computational biology. The basic problems, Scan and Multiscan, aim to find all the occurrences of a given PWM or a set of PWMs in long sequences. Some other PWM tasks share a common NP-hard subproblem, ScoreDistribution. The existing algorithms rely on the enumeration of a large set […]
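To make the Scan problem concrete, here is a naive sliding-window sketch. It assumes an occurrence is a window whose summed score meets a threshold, which is the usual definition but is not spelled out in the abstract; the toy matrix, the threshold, and the function name `scan` are hypothetical, and this is not the algorithm proposed in the paper.

```python
def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at least threshold.

    pwm is a list of per-position dicts mapping a letter to its weight.
    """
    m = len(pwm)
    hits = []
    for start in range(len(sequence) - m + 1):
        # Score of a window = sum of the weights of its letters, by position.
        score = sum(pwm[j][sequence[start + j]] for j in range(m))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy PWM of length 3 that favours the word "ACG" (illustrative weights).
pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
hits = scan(pwm, "TACGACGT", threshold=6)
```

Each window is scored independently of the others, so the loop over `start` is naturally data-parallel. Multiscan would repeat the same procedure for every matrix in a set.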

