Posts
May, 11
Whole-function vectorization
Data-parallel programming languages are an important component in today’s parallel computing landscape. Among those are domain-specific languages like shading languages in graphics (HLSL, GLSL, RenderMan, etc.) and “general-purpose” languages like CUDA or OpenCL. Current implementations of those languages on CPUs rely solely on multi-threading to implement parallelism and ignore the additional intra-core parallelism provided by […]
May, 11
Gemma in April: A matrix-like parallel programming architecture on OpenCL
The Graphics Processing Unit (GPU), a kind of massively parallel processor, is now widely used for general-purpose computing tasks. Although mature development tools exist, writing GPU programs is still not a trivial task for programmers. With this in mind, we propose a novel parallel computing architecture. The architecture includes a […]
May, 11
High performance memetic algorithm particle filter for multiple object tracking on modern GPUs
This work presents an effective approach to visual tracking that uses a graphics processing unit (GPU) for computation. To obtain a performance improvement over other platforms, it helps to select suitable algorithms, such as population-based ones. These are parallel-friendly by nature, requiring many independent evaluations that map well to the parallel […]
May, 10
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory
In this paper, we describe a runtime to automatically enhance the performance of applications running on heterogeneous platforms consisting of a multi-core (CPU) and a throughput-oriented many-core (GPU). The CPU and GPU are connected by a non-coherent interconnect such as PCI-E, and as such do not have shared memory. Heterogeneous platforms available today such as […]
May, 10
The GPU on the simulation of cellular computing models
Membrane Computing is a discipline aiming to abstract formal computing models, called membrane systems or P systems, from the structure and functioning of living cells as well as from the cooperation of cells in tissues, organs, and other higher-order structures. This framework provides polynomial-time solutions to NP-complete problems by trading space for […]
May, 10
Exact and complete short-read alignment to microbial genomes using Graphics Processing Unit programming
MOTIVATION: The introduction of next-generation sequencing techniques, and especially the high-throughput systems Solexa (Illumina Inc.) and SOLiD (ABI), made the mapping of short reads to reference sequences a standard application in modern bioinformatics. Short-read alignment is needed for reference-based re-sequencing of complete genomes as well as for gene expression analysis based on transcriptome sequencing. […]
May, 10
Fast Parallel Tandem Mass Spectral Library Searching Using GPU Hardware Acceleration
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and […]
May, 10
Accelerating image registration of MRI by GPU-based parallel computation
Automatic image registration for MRI applications generally requires many iterations and is, therefore, a time-consuming task. This drawback prolongs data analysis and delays the workflow of clinical routines. Recent advances in the massively parallel computation of graphics processing units (GPUs) may be a solution to this problem. This study proposes a method to accelerate […]
May, 10
Astrophysical particle simulations with large custom GPU clusters on three continents
We present direct astrophysical N-body simulations with up to six million bodies using our parallel MPI-CUDA code on large GPU clusters in Beijing, Berkeley, and Heidelberg, with different kinds of GPU hardware. The clusters are linked through the cooperation of ICCS (International Center for Computational Science). We reach about one third of the peak performance […]
May, 10
Stock trading strategy creation using GP on GPU
This paper investigates the speed improvements available when using a graphics processing unit (GPU) for evaluation of individuals in a genetic programming (GP) environment. An existing GP system is modified to enable parallel evaluation of individuals on a GPU device. Several issues related to implementing GP on GPU are discussed, including how to perform tree-based […]
May, 10
Parallel preconditioned conjugate gradient algorithm on GPU
We propose a parallel implementation of the Preconditioned Conjugate Gradient algorithm on a GPU platform. The preconditioning matrix is an approximate inverse derived from the SSOR preconditioner. Used through sparse matrix-vector multiplication, the proposed preconditioner is well-suited for the massively parallel GPU architecture. Compared to a CPU implementation of the conjugate gradient algorithm, our GPU preconditioned […]
May, 10
Efficient Parallelization of the Stochastic Simulation Algorithm for Chemically Reacting Systems On the Graphics Processing Unit
The small number of some reactant molecules in biological systems formed by living cells can result in dynamical behavior that cannot be captured by traditional deterministic models. In such cases, a more accurate simulation can be obtained with discrete stochastic simulation (Gillespie’s stochastic simulation algorithm – SSA). Many stochastic realizations are required to capture […]