Posts
Dec, 18
Efficient Multi-GPU Computation of All-Pairs Shortest Paths
We describe a new algorithm for solving the all-pairs shortest-path (APSP) problem for planar graphs and graphs with small separators that exploits the massive on-chip parallelism available in today’s Graphics Processing Units (GPUs). Our algorithm, based on the Floyd-Warshall algorithm, has near optimal complexity in terms of the total number of operations, while its matrix-based […]
Dec, 18
A comparative analysis of the performance and deployment overhead of parallelized Finite Difference Time Domain (FDTD) algorithms on a selection of high performance multiprocessor computing systems
The parallel FDTD method as used in computational electromagnetics is implemented on a variety of different high performance computing platforms. These parallel FDTD implementations have regularly been compared in terms of performance or purchase cost, but very little systematic consideration has been given to how much effort has been used to create the parallel FDTD […]
Dec, 18
GPU Accelerated Semiclassical Initial Value Representation Molecular Dynamics
This paper presents a graphics processing unit (GPU) implementation of the semiclassical initial value representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the CUDA implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered […]
Dec, 17
Data Structures for Task-based Priority Scheduling
Many task-parallel applications can benefit from attempting to execute tasks in a specific order, as for instance indicated by priorities associated with the tasks. We present three lock-free data structures for priority scheduling with different trade-offs on scalability and ordering guarantees. First we propose a basic extension to work-stealing that provides good scalability, but cannot […]
Dec, 17
Development methodologies for GPU and cluster of GPUs
This chapter presents several development methodologies for obtaining efficient code in classical scientific applications. These methodologies are based on feedback from several research works involving GPUs, either alone in a single machine or in a cluster of machines. Indeed, our past collaborations with industries have allowed us to point out that in […]
Dec, 17
OpenCL Accelerated Multi-GPU Cone-Beam Reconstruction
Volume reconstruction in cone-beam CT is a computationally demanding task. In recent years, reconstruction has been accelerated by utilizing Graphics Processing Units (GPUs). Frameworks for general-purpose computation on GPUs are a proven tool for accessing the resources of graphics cards. With the Open Computing Language (OpenCL) the first open standard for cross-vendor and cross-platform programming […]
Dec, 17
High Performance Computing Image Analysis for Radiotherapy Planning
The Edinburgh Cancer Centre at the Western General Hospital in Edinburgh is researching image analysis for predicting radiation-induced lung fibrosis as part of a treatment plan. They are developing MATLAB code to analyse three-dimensional computed tomography (CT) images of patients but, because a standard three-dimensional CT image is […]
Dec, 17
Parallel Firewalls on General-Purpose Graphics Processing Units
Firewalls use a rule database to decide which packets will be allowed from one network onto another, thereby implementing a security policy. In high-speed networks, as the inter-arrival time of packets decreases, the latency incurred by a firewall increases. In such a scenario, a single firewall becomes a bottleneck and reduces the overall throughput of […]
Dec, 16
Ray-Traced Collision Detection: Interpenetration Control and Multi-GPU Performance
We proposed [LGA13] an iterative ray-traced collision detection algorithm (IRTCD) that exploits spatial and temporal coherency and proved to be computationally efficient but at the price of some geometrical approximations that allow more interpenetration than needed. In this paper, we present two methods to efficiently control and reduce the interpenetration without noticeable computation overhead. The […]
Dec, 16
Exploiting parallel features of modern computer architectures in bioinformatics
The exponential growth in bioinformatics data generation and the stagnation of processor frequencies in modern processors stress the need for efficient implementations that fully exploit the parallel capabilities offered by modern computers. This thesis focuses on parallel algorithms and implementations for bioinformatics problems. Various types of parallelism are described and exploited. This thesis presents applications […]
Dec, 16
A Cloud Computing Service Architecture of a Parallel Algorithm Oriented to Scientific Computing with CUDA and Monte Carlo
General-purpose computing on graphics processing units (GPGPU) has become a whole new area of research due to the fast development of GPU hardware and programming tools such as CUDA (Compute Unified Device Architecture). Here we study CUDA and its applications in the field of scientific computation, such as organ electronics. We propose […]
Dec, 16
Parallel GPU Implementation of Hough Transform for Circles
The Hough transform is one of the most widely used algorithms in image processing. Its major drawbacks are its long running time and its heavy demand for computational resources. In this paper, we try to solve this problem by parallelizing the algorithm and implementing it on GPUs (Graphics Processing Units) using CUDA (Compute Unified […]