Posts
Nov, 20
GPUs for fast pattern matching in the RICH of the NA62 experiment
In rare-decay experiments, effective online selection is a fundamental part of the data acquisition (DAQ) system, reducing both the quantity of data written to tape and the bandwidth requirements of the DAQ system. A multilevel architecture is commonly used to achieve a higher reduction factor, exploiting dedicated custom hardware and […]
Nov, 20
Patient-Specific Non-Linear Finite Element Modelling for Predicting Soft Organ Deformation in Real-Time; Application to Non-Rigid Neuroimage Registration
Long computation times of non-linear (i.e. accounting for geometric and material non-linearity) biomechanical models have been regarded as one of the key factors preventing application of such models in predicting organ deformation for image-guided surgery. This contribution presents real-time patient-specific computation of the deformation field within the brain for six cases of brain shift induced […]
Nov, 20
Visualization of LIDAR datasets using point-based rendering technique
Remote sensing technologies, such as LIDAR, rapidly evolve and produce large datasets. The computers used to visualize these data have limited resources, which prevent detailed and real-time visualization. An approach to real-time visualization of virtually unlimited LIDAR datasets, at full detail with a hierarchical and out-of-core approach to data management and a modern point-based rendering […]
Nov, 20
Skeleton and Shape Adjustment and Tracking in Multicamera Environments
In this paper we present a method for automatic body model adjustment and motion tracking in multicamera environments. We introduce a set of shape deformation parameters, based on linear blend skinning, that allow a deformation related to the scaling of the distinct bones of the body model skeleton and a deformation in the radial direction […]
Nov, 20
Regular Lattice and Small-World Spin Model Simulations Using CUDA and GPUs
Data-parallel accelerator devices such as Graphics Processing Units (GPUs) are providing dramatic performance improvements over even multi-core CPUs for lattice-oriented applications in computational physics. Models such as the Ising and Potts models continue to play a role in investigating phase transitions on small-world and scale-free graph structures. These models are particularly well-suited to the performance […]
Nov, 20
An instruction-systolic programmable shader architecture for multi-threaded 3D graphics processing
To meet both the performance and programmability demands of 3D graphics applications, vector and multithreaded SIMD architectures have been employed in recent graphics processing units. This paper introduces a novel instruction-systolic array architecture, which transfers an instruction stream in a pipelined fashion to efficiently share the expensive functional resources of a graphics processor. Specifically, […]
Nov, 20
A middleware for efficient stream processing in CUDA
This paper presents a middleware capable of out-of-order execution of kernels and data transfers for efficient stream processing in the compute unified device architecture (CUDA). Our middleware runs on a CUDA-compatible graphics processing unit (GPU). Using the middleware, application developers can easily overlap kernel computation with data transfer between the main memory and […]
Nov, 20
Simulating a P system based efficient solution to SAT by using GPUs
P systems are inherently parallel and non-deterministic theoretical computing devices defined within the field of Membrane Computing. Many P system simulators have been presented in this area, but they are inefficient since they cannot handle the parallelism of these devices. Nowadays, we are witnessing the consolidation of GPUs as a parallel framework to […]
Nov, 20
Simulation of one-layer shallow water systems on multicore and CUDA architectures
The numerical solution of shallow water systems is useful for several applications related to geophysical flows, but the large size of the domains suggests the use of powerful accelerators to obtain numerical results in reasonable times. This paper addresses how to speed up the numerical solution of a first-order well-balanced finite volume scheme for […]
Nov, 20
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort
Sort is a fundamental kernel used in many database operations. In-memory sorts are now feasible; sort performance is limited by compute flops and main memory bandwidth rather than I/O. In this paper, we present a competitive analysis of comparison and non-comparison based sorting algorithms on two modern architectures – the latest CPU and GPU architectures. […]
Nov, 20
SBLOCK: A Framework for Efficient Stencil-Based PDE Solvers on Multi-core Platforms
We present a new software framework for the implementation of applications that use stencil computations on block-structured grids to solve partial differential equations. A key feature of the framework is the extensive use of automatic source code generation, which is used to achieve high performance on a range of leading multi-core processors. Results are presented […]
Nov, 20
A GPGPU Transparent Virtualization Component for High Performance Computing Clouds
The GPU Virtualization Service (gVirtuS) presented in this work tries to fill the gap between in-house hosted computing clusters, equipped with GPGPU devices, and pay-for-use high performance virtual clusters deployed via public or private computing clouds. gVirtuS allows an instanced virtual machine to access GPGPUs in a transparent and hypervisor-independent way, with an overhead […]