7428

Posts

Mar, 26

GPUstore: Harnessing GPU Computing for Storage Systems in the OS Kernel

Many storage systems include computationally expensive components. Examples include encryption for confidentiality, checksums for integrity, and error correcting codes for reliability. As storage systems become larger, faster, and serve more clients, the demands placed on their computational components increase and they can become performance bottlenecks. Many of these computational tasks are inherently parallel: they can […]
Mar, 26

Full system simulation of many-core heterogeneous SoCs using GPU and QEMU semihosting

Modern system-on-chips are evolving towards complex and heterogeneous platforms with general purpose processors coupled with massively parallel manycore accelerator fabrics (e.g. embedded GPUs). Platform developers are looking for efficient full-system simulators capable of simulating complex applications, middleware and operating systems on these heterogeneous targets. Unfortunately current virtual platforms are not able to tackle the complexity […]
Mar, 23

Volume-preserving FFD for programmable graphics hardware

Free-Form Deformation (FFD) is a well established technique for deforming arbitrary object shapes in space. Although more recent deformation techniques have been introduced, among them skeleton-based deformation and cage-based deformation, the simple and versatile nature of FFD is a strong advantage, and justifies its presence in nowadays leading commercial geometric modeling and animation software systems. […]
Mar, 23

Accelerating large-scale simulations of cortical neuronal network development

Cultured dissociated cortical cells grown into networks on multi-electrode arrays are used to investigate neuronal network development, activity, plasticity, response to stimuli, the effects of pharmacological agents, etc. We made a computational model of such a neuronal network and studied the interplay of individual neuron activity, cell culture development, and network behavior. For small networks […]
Mar, 23

CUDA implementation of Wagener’s 2D convex hull PRAM algorithm

This paper describes a CUDA implementation of Wagener’s PRAM convex hull algorithm in two dimensions. It is presented in Knuth’s literate programming style.
Mar, 23

Advanced Programming Platform for efficient use of Data Parallel Hardware

Graphics processing units (GPU) had evolved from a specialized hardware capable to render high quality graphics in games to a commodity hardware for effective processing blocks of data in a parallel schema. This evolution is particularly interesting for scientific groups, which traditionally use mainly CPU as a work horse, and now can profit of the […]
Mar, 23

A Co-Prime Blur Scheme for Data Security in Video Surveillance

This paper presents a novel Coprime Blurred Pair (CBP) model for visual data-hiding for security in camera surveillance. While most previous approaches have focused on completely encrypting the video stream, we introduce a spatial encryption scheme by blurring the image/video contents to create a CBP. Our goal is to obscure detail in public video streams […]
Mar, 22

Impact of asynchronism on GPU accelerated parallel iterative computations

We study the impact of asynchronism on parallel iterative algorithms in the particular context of local clusters of workstations including GPUs. The application test is a classical PDE problem of advection-diffusion-reaction in 3D. We propose an asynchronous version of a previously developed PDE solver using GPUs for the inner computations. The algorithm is tested with […]
Mar, 22

Hierarchical N-body simulations with auto-tuning for heterogeneous systems

Algorithms designed to efficiently solve this classical problem of physics fit very well on GPU hardware, and exhibit excellent scalability on many GPUs. Their computational intensity makes them a promising approach for many other applications amenable to an N-body formulation. Adding features such as auto-tuning makes multipole-type algorithms ideal for heterogeneous computing environments.
Mar, 22

GPU-based parallel collision detection for fast motion planning

We present parallel algorithms to accelerate collision queries for sample-based motion planning. Our approach is designed for current many-core GPUs and exploits data-parallelism and multi-threaded capabilities. In order to take advantage of the high number of cores, we present a clustering scheme and collision-packet traversal to perform efficient collision queries on multiple configurations simultaneously. Furthermore, […]
Mar, 22

Multi-target vectorization with MTPS C++ generic library

This article introduces a C++ template library dedicated at vectorizing algorithms for different target architectures: Multi-Target Parallel Skeleton (MTPS). Skeletons describing the data structures and algorithms are provided and allow MTPS to generate a code with optimized memory access patterns for the choosen architecture. MTPS currently supports x86-64 multicore CPUs and CUDA enabled GPUs. On […]
Mar, 22

High Speed Compressed Sensing Reconstruction in Dynamic Parallel MRI Using Augmented Lagrangian and Parallel Processing

Magnetic Resonance Imaging (MRI) is one of the fields that the compressed sensing theory is well utilized to reduce the scan time significantly leading to faster imaging or higher resolution images. It has been shown that a small fraction of the overall measurements are sufficient to reconstruct images with the combination of compressed sensing and […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: