Posts
Mar, 23
Accelerating large-scale simulations of cortical neuronal network development
Cultured dissociated cortical cells grown into networks on multi-electrode arrays are used to investigate neuronal network development, activity, plasticity, response to stimuli, the effects of pharmacological agents, etc. We made a computational model of such a neuronal network and studied the interplay of individual neuron activity, cell culture development, and network behavior. For small networks […]
Mar, 23
CUDA implementation of Wagener’s 2D convex hull PRAM algorithm
This paper describes a CUDA implementation of Wagener’s PRAM convex hull algorithm in two dimensions. It is presented in Knuth’s literate programming style.
Mar, 23
Advanced Programming Platform for efficient use of Data Parallel Hardware
Graphics processing units (GPUs) have evolved from specialized hardware capable of rendering high-quality graphics in games into commodity hardware for effective processing of blocks of data in a parallel scheme. This evolution is particularly interesting for scientific groups, which traditionally use mainly CPUs as a workhorse and can now profit from the […]
Mar, 23
A Co-Prime Blur Scheme for Data Security in Video Surveillance
This paper presents a novel Coprime Blurred Pair (CBP) model for visual data-hiding for security in camera surveillance. While most previous approaches have focused on completely encrypting the video stream, we introduce a spatial encryption scheme by blurring the image/video contents to create a CBP. Our goal is to obscure detail in public video streams […]
Mar, 22
Impact of asynchronism on GPU accelerated parallel iterative computations
We study the impact of asynchronism on parallel iterative algorithms in the particular context of local clusters of workstations including GPUs. The test application is a classical 3D advection-diffusion-reaction PDE problem. We propose an asynchronous version of a previously developed PDE solver using GPUs for the inner computations. The algorithm is tested with […]
Mar, 22
Hierarchical N-body simulations with auto-tuning for heterogeneous systems
Algorithms designed to efficiently solve this classical problem of physics fit very well on GPU hardware, and exhibit excellent scalability on many GPUs. Their computational intensity makes them a promising approach for many other applications amenable to an N-body formulation. Adding features such as auto-tuning makes multipole-type algorithms ideal for heterogeneous computing environments.
Mar, 22
GPU-based parallel collision detection for fast motion planning
We present parallel algorithms to accelerate collision queries for sample-based motion planning. Our approach is designed for current many-core GPUs and exploits data-parallelism and multi-threaded capabilities. In order to take advantage of the high number of cores, we present a clustering scheme and collision-packet traversal to perform efficient collision queries on multiple configurations simultaneously. Furthermore, […]
Mar, 22
Multi-target vectorization with MTPS C++ generic library
This article introduces a C++ template library dedicated to vectorizing algorithms for different target architectures: Multi-Target Parallel Skeleton (MTPS). Skeletons describing the data structures and algorithms are provided and allow MTPS to generate code with optimized memory access patterns for the chosen architecture. MTPS currently supports x86-64 multicore CPUs and CUDA-enabled GPUs. On […]
Mar, 22
High Speed Compressed Sensing Reconstruction in Dynamic Parallel MRI Using Augmented Lagrangian and Parallel Processing
Magnetic Resonance Imaging (MRI) is one of the fields in which compressed sensing theory is well utilized to reduce the scan time significantly, leading to faster imaging or higher-resolution images. It has been shown that a small fraction of the overall measurements is sufficient to reconstruct images with the combination of compressed sensing and […]
Mar, 21
Accelerating Sparse Matrix Vector Multiplication on Many-Core GPUs
Many-core GPUs provide high computing ability and substantial bandwidth; however, optimizing irregular applications like SpMV on GPUs becomes a difficult but meaningful task. In this paper, we propose a novel method to improve the performance of SpMV on GPUs. A new storage format called HYB-R is proposed to exploit GPU architecture more efficiently. The COO […]
Mar, 21
Parallel Two-Stage Least Squares algorithms for Simultaneous Equations Models on GPU
Today it is usual to have computational systems formed by a multicore together with one or more GPUs. These systems are heterogeneous, due to the different types of memory in the GPUs and to the different speeds of computation of the cores in the CPU and the GPU. To accelerate the solution of […]
Mar, 21
Fast Antenna Characterization Using the Sources Reconstruction Method on Graphics Processors
The Sources Reconstruction Method (SRM) is a non-invasive technique for, among other applications, antenna characterization. The SRM is based on obtaining a distribution of equivalent currents that radiate the same field as the antenna under test. The computation of these currents requires solving a linear system, usually ill-posed, that may be very computationally demanding for […]