Posts
Oct, 21
Computational Fluid Dynamics Using Graphics Processing Units: Challenges and Opportunities
A new paradigm for computing fluid flows is the use of Graphics Processing Units (GPUs), which have recently become very powerful and convenient to use. In the past three years, we have implemented five different fluid flow algorithms on GPUs and have obtained significant speed-ups over a single CPU. Typically, it is possible to achieve […]
Oct, 21
DOPA: GPU-based protein alignment using database and memory access optimizations
BACKGROUND: The Smith-Waterman (S-W) algorithm is an optimal sequence alignment method for biological databases, but its computational complexity makes it too slow for practical purposes. Heuristic-based approximate methods like FASTA and BLAST provide faster solutions but at the cost of reduced accuracy. Also, the expanding volume and varying lengths of sequences necessitate performance-efficient restructuring […]
Oct, 21
Massively parallel computation using graphics processors with application to optimal experimentation in dynamic control
The rapid growth in the performance of graphics hardware, coupled with recent improvements in its programmability, has led to its adoption in many non-graphics applications, including a wide variety of scientific computing fields. At the same time, a number of important dynamic optimal policy problems in economics are hungry for computing power to help overcome […]
Oct, 21
Accelerating exotic option pricing and model calibration using GPUs
Pricing and risk analysis for today’s exotic structured equity products is increasingly demanding computationally and time-consuming. GPUs offer the possibility to significantly increase computing performance even at reduced costs. We applied this technology to replace a large part of our CPU-based computing grid with hybrid GPU/CPU pricing engines. One GPU-based […]
Oct, 21
OpenMP for Accelerators
OpenMP [14] is the dominant programming model for shared-memory parallelism in C, C++ and Fortran due to its easy-to-use directive-based style, portability and broad support by compiler vendors. Compute-intensive application regions are increasingly being accelerated using devices such as GPUs and DSPs, and a programming model with similar characteristics is needed here. This paper presents […]
Oct, 21
Efficient Synchronization Primitives for GPUs
In this paper, we revisit the design of synchronization primitives—specifically barriers, mutexes, and semaphores—and how they apply to the GPU. Previous implementations are insufficient due to the discrepancies in hardware and programming model of the GPU and CPU. We create new implementations in CUDA and analyze the performance of spinning on the GPU, as well […]
Oct, 20
Automatic program analysis for data parallel kernels
It is widely known that GPUs have more computational power and expose a far greater level of parallelism than conventional CPUs. Despite their high potential, GPUs are not yet a popular choice in practice, mainly because of their high programming complexity. The complexity derives from two factors. First, the existing programming models are tied to […]
Oct, 20
GPU Accelerated X-Ray Image Enhancement
This paper presents an automated method for preparing digital X-rays for use by a procedural mesh generator. This process will facilitate the generation of a 3D polygon mesh depicting the bones contained within the X-ray image. The process of preparing the image involves identifying and retaining bone elements whilst removing any superfluous aspects contained within […]
Oct, 20
Enabling Computational Dynamics in Distributed Computing Environments Using a Heterogeneous Computing Template
This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so-called Heterogeneous Computing Template (HCT). The underlying theme of the solution approach embraced by HCT is that of partitioning […]
Oct, 20
A high performance computing framework for physics-based modeling and simulation of military ground vehicles
This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so-called Heterogeneous Computing Template (HCT). The heterogeneous and distributed computing hardware infrastructure is assumed herein to be made up […]
Oct, 20
Experimental B+-tree for GPU
The main intention of this work is to create a dictionary structure that can benefit from the massive parallelism of threads when performing computation on all or a selected set of elements, while being able to search for and insert keys very quickly and still preserving the order of elements. So far, no such structure dedicated […]
Oct, 20
PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems
PEPPHER, a three-year European FP7 project, addresses efficient utilization of hybrid (heterogeneous) computer systems consisting of multicore CPUs with GPU-type accelerators. This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project. A larger example demonstrates performance portability with the PEPPHER approach across hybrid systems with one […]