Posts
Feb, 1
High Performance Computing of Dynamic Structural Response Analysis for the Integrated Earthquake Simulation
This paper proposes an application of high performance computing (HPC) to dynamic structural response analysis (DSRA) in order to enhance the capability and increase the efficiency of integrated earthquake simulation (IES). Object Based Structural Analysis (OBASAN) is a candidate DSRA program for IES. With OBASAN, the reliability of structural damage prediction can be increased by […]
Feb, 1
Survey on Efficient Linear Solvers for Porous Media Flow Models on Recent Hardware Architectures
In the past few years, High Performance Computing (HPC) technologies led to General Purpose Processing on Graphics Processing Units (GPGPU) and many-core architectures. These emerging technologies offer massive processing units and are interesting for porous media flow simulators that may be used for CO2 geological sequestration or Enhanced Oil Recovery (EOR) simulation. However, the crucial point is "are […]
Jan, 30
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes
The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of the computing resources. The pressure to maintain reasonable levels of performance and portability forces the application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic […]
Jan, 30
Towards Efficient Risk Quantification-Using GPUs and Variance Reduction Technique
Value-at-Risk (VaR) provides information about global risk in trading. The demand for high-speed VaR calculation is rising because financial institutions need to measure risk in real time. Researchers in HPC have also recently turned their attention to this kind of demanding application. In this master thesis, we introduce two complementary and different strategies […]
Jan, 30
A Novel Graphical Processing Unit Method for Power Systems Security Analysis
There is an increasing need for computational power to drive software tools used in power systems planning and operations, since the emergence of modern energy markets and recent renewable generation technology fundamentally alters how energy flows through the existing power grid. While special-purpose hardware, including supercomputers, has been explored for this purpose, inexpensive commodity hardware […]
Jan, 30
Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips
Single Instruction, Multiple Data (SIMD) vectorization is a major driver of performance in current architectures, and is mandatory for achieving good performance with codes that are limited by instruction throughput. We investigate the efficiency of different SIMD-vectorized implementations of the RabbitCT benchmark. RabbitCT performs 3D image reconstruction by back projection, a vital operation in computed […]
Jan, 30
GPU-Accelerated BWT Construction for Large Collection of Short Reads
Advances in DNA sequencing technology have stimulated the development of algorithms and tools for processing very large collections of short strings (reads). Short-read alignment and assembly are among the most well-studied problems. Many state-of-the-art aligners, at their core, have used the Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome (typical example, NCBI […]
Jan, 30
A GPU accelerated algorithm for 3D Delaunay triangulation
We propose the first algorithm to compute the 3D Delaunay triangulation (DT) on the GPU. Our algorithm uses massively parallel point insertion followed by bilateral flipping, a powerful local operation in computational geometry. Although a flipping algorithm is very amenable to parallel processing and has been employed to construct the 2D DT and the 3D […]
Jan, 30
A CUDA Monte Carlo simulator for radiation therapy dosimetry based on Geant4
Geant4 is a large-scale particle physics package that facilitates every aspect of particle transport simulation. This includes, but is not limited to, geometry description, material definition, tracking of particles passing through and interacting with matter, storage of event data, and visualization. As more detailed and complex simulations are required in different application domains, there is […]
Jan, 30
A QUDA-branch to compute disconnected diagrams in GPUs
Although QUDA allows for an efficient computation of many QCD quantities, it is surprisingly lacking tools to evaluate disconnected diagrams, for which GPUs are especially well suited. We aim to fill this gap by creating our own branch of QUDA, which includes new kernels and functions required to calculate fermion loops using several methods and […]
Jan, 29
A Detailed GPU Cache Model Based on Reuse Distance Theory
As modern GPUs rely partly on their on-chip memories to counter the imminent off-chip memory wall, the efficient use of their caches has become important for performance and energy. However, optimising cache locality systematically requires insight into and prediction of cache behaviour. On sequential processors, stack distance or reuse distance theory is a well-known means […]
Jan, 29
Hybrid algorithms for efficient Cholesky decomposition and matrix inverse using multicore CPUs with GPU accelerators
The use of linear algebra routines is fundamental to many areas of computational science, yet their implementation in software still forms the main computational bottleneck in many widely used algorithms. In machine learning and computational statistics, for example, the use of Gaussian distributions is ubiquitous, and routines for calculating the Cholesky decomposition, matrix inverse and […]

