Posts
Jan, 17
Evaluating GPUs for network packet signature matching
Modern network devices employ deep packet inspection to enable sophisticated services such as intrusion detection, traffic shaping, and load balancing. At the heart of such services is a signature matching engine that must match packet payloads to multiple signatures at line rates. However, the recent transition to complex regular-expression based signatures coupled with ever-increasing network […]
Jan, 17
Acceleration of Acoustic Emission Signal Processing Algorithms using CUDA Standard
Offline processing of acoustic emission (AE) signal waveforms recorded during a long-term AE monitoring session is a challenging problem in the AE testing area, because today’s AE systems can work with up to hundreds of channels and are able to process tens of thousands of AE events per second. The […]
Jan, 17
Real-Time Non-rigid Registration of Medical Images on a Cooperative Parallel Architecture
The unacceptable execution time of non-rigid registration (NRR) often presents a major obstacle to its routine clinical use. Parallel computing is an effective way to accelerate NRR. However, developing efficient parallel NRR codes is a very challenging task. One desirable approach is to map the existing sequential algorithm to the parallel architecture to gain speedup […]
Jan, 17
Parallelization Strategies for Ant Colony Optimisation on GPUs
Ant Colony Optimisation (ACO) is an effective population-based meta-heuristic for the solution of a wide variety of problems. As a population-based algorithm, its computation is intrinsically massively parallel, and it is therefore theoretically well-suited for implementation on Graphics Processing Units (GPUs). The ACO algorithm comprises two main stages: Tour construction and Pheromone update. The […]
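The two stages named in this excerpt can be illustrated with a minimal sequential sketch for a small travelling-salesman instance. It is not the paper's GPU implementation: the function names, the parameters `alpha`, `beta`, and `rho`, and the `1/length` deposit rule are assumptions taken from the textbook Ant System formulation.

```python
import random


def construct_tour(dist, pheromone, alpha=1.0, beta=2.0, rng=random):
    """Stage 1 (tour construction): one ant visits every city once.

    The next city is drawn with probability proportional to
    pheromone**alpha * (1 / distance)**beta (assumed Ant System rule).
    """
    n = len(dist)
    tour = [rng.randrange(n)]
    unvisited = set(range(n)) - {tour[0]}
    while unvisited:
        here = tour[-1]
        candidates = sorted(unvisited)
        weights = [
            pheromone[here][j] ** alpha * (1.0 / dist[here][j]) ** beta
            for j in candidates
        ]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(dist, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def update_pheromone(pheromone, tours, lengths, rho=0.5):
    """Stage 2 (pheromone update): evaporate, then deposit 1/length per edge."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= 1.0 - rho
    for tour, length in zip(tours, lengths):
        for k in range(len(tour)):
            a, b = tour[k], tour[(k + 1) % len(tour)]
            pheromone[a][b] += 1.0 / length
            pheromone[b][a] += 1.0 / length


def aco(dist, n_ants=8, n_iters=20, seed=0):
    """Alternate the two stages and keep the shortest tour seen so far."""
    rng = random.Random(seed)
    n = len(dist)
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        # Each ant's tour is independent of the others within an iteration.
        tours = [construct_tour(dist, pheromone, rng=rng) for _ in range(n_ants)]
        lengths = [tour_length(dist, t) for t in tours]
        update_pheromone(pheromone, tours, lengths)
        for t, length in zip(tours, lengths):
            if length < best_len:
                best_tour, best_len = t, length
    return best_tour, best_len
```

Calling `aco(dist)` with a symmetric distance matrix returns the best tour found and its length. Because the ants in one iteration build their tours independently, that loop is the natural candidate for parallel execution.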
Jan, 16
Interactive visual analysis of contrast-enhanced ultrasound data based on local neighborhood statistics
Contrast-enhanced ultrasound (CEUS) has recently become an important technology for lesion detection and characterization in cancer diagnosis. CEUS is used to investigate the perfusion kinetics in tissue over time, which relates to tissue vascularization. In this paper we present a pipeline that enables interactive visual exploration and semi-automatic segmentation and classification of CEUS data. For […]
Jan, 16
An OpenCL framework for heterogeneous multicores with local memory
In this paper, we present the design and implementation of an Open Computing Language (OpenCL) framework that targets heterogeneous accelerator multicore architectures with local memory. The architecture consists of a general-purpose processor core and multiple accelerator cores that typically do not have any cache. Each accelerator core, instead, has a small internal local memory. Our […]
Jan, 16
Using generalized ensemble simulations and Markov state models to identify conformational states
Part of understanding a molecule’s conformational dynamics is mapping out the dominant metastable, or long lived, states that it occupies. Once identified, the rates for transitioning between these states may then be determined in order to create a complete model of the system’s conformational dynamics. Here we describe the use of the MSMBuilder package (now […]
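As a rough illustration of how transition probabilities between already-identified states can be estimated, the sketch below counts transitions in a discretized trajectory and normalizes each row. It is a generic maximum-likelihood estimate, not the MSMBuilder API: the function name `transition_matrix` and the `lag` parameter are hypothetical.

```python
def transition_matrix(trajectory, n_states, lag=1):
    """Estimate a row-stochastic transition matrix from a state sequence.

    trajectory: list of integer state labels in [0, n_states).
    lag: number of steps between the "from" and "to" observations.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    # Pair every frame with the frame `lag` steps later and count the jump.
    for a, b in zip(trajectory, trajectory[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that were never left get an all-zero row rather than a division by zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Entry `[i][j]` of the result is the estimated probability of being in state `j` one lag time after being in state `i`.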
Jan, 16
MPI-CUDA parallelization of a finite-strip program for geometric nonlinear analysis: A hybrid approach
A finite-strip geometric nonlinear analysis is presented for elastic problems involving folded-plate structures. Compared with the standard finite-element method, its main advantages are in data preparation, program complexity, and execution time. The finite-strip method, which satisfies the von Karman plate equations in the nonlinear elastic range, leads to the coupling of all harmonics. However, coupling […]
Jan, 16
A symbolic verifier for CUDA programs
We present a preliminary automated verifier based on mechanical decision procedures which is able to prove functional correctness of CUDA programs and is guaranteed to detect bugs such as race conditions. We also employ a symbolic partial order reduction (POR) technique to mitigate the interleaving explosion problem.
Jan, 16
Daubechies wavelets for high performance electronic structure calculations: The BigDFT project
In this contribution we will describe in detail a Density Functional Theory method based on a Daubechies wavelets basis set, named BigDFT. We will see that, thanks to wavelet properties, this code shows high systematic convergence properties, very good performance and excellent efficiency for parallel calculations. BigDFT code operations are also well suited for GPU […]
Jan, 16
Introduction to GPGPU, a hardware and software background
This article gives an introduction to GPU usage for High Performance Computing. After setting the context, we will describe the hardware and the programming languages currently available to programmers. From these explanations we will touch on the implications of these technologies for simulation codes and try to give trends for the future.
Jan, 16
Fluid-solid coupling on a cluster of GPU graphics cards for seismic wave propagation
We develop a hybrid multi-GPU and CPU version of an algorithm to model seismic wave propagation based on the spectral-element method in the case of models of the Earth containing both fluid and solid layers. Thanks to the overlapping of communications between processing nodes on the computer with calculation by means of non-blocking message passing, […]