In the near future, massively parallel computing systems will be necessary to solve computation intensive applications. The key bottleneck in massively parallel implementation of numerical algorithms is the synchronization of data across processing elements (PEs) after each iteration, which results in significant idle time. Thus, there is a trend towards relaxing the synchronization and adopting […]

March 18, 2015 by hgpu

In this thesis we present the first, to our knowledge, implementation and performance analysis of Hermite methods on GPU accelerated systems. We give analytic background for Hermite methods; give implementations of the Hermite methods on traditional CPU systems as well as on GPUs; give the reader background on basic CUDA programming for GPUs; discuss performance […]

February 2, 2015 by hgpu

In an ideal world, scientific applications are computationally efficient, maintainable and composable and allow scientists to work very productively. We argue that these goals are achievable for a specific application field by choosing suitable domain-specific abstractions that encapsulate domain knowledge with a high degree of expressiveness. This thesis demonstrates the design and composition of domain-specific […]

February 1, 2015 by hgpu

The High Performance Conjugate Gradient (HPCG) benchmark has been recently proposed as a complement to the High Performance Linpack (HPL) benchmark currently used to rank supercomputers in the Top500 list. This new benchmark solves a large sparse linear system using a multigrid preconditioned conjugate gradient (PCG) algorithm. The PCG algorithm contains the computational and communication […]

November 29, 2014 by hgpu

In view of the rapid rise of the number of cores in modern supercomputers, time-parallel methods that introduce concurrency along the temporal axis are becoming increasingly popular. For the solution of time-dependent partial differential equations, these methods can add another direction for concurrency on top of spatial parallelization. The paper presents an implementation of the […]

October 3, 2014 by hgpu

As parallel and heterogeneous computing becomes more and more a necessity for implementing high performance simulators, it becomes increasingly harder for scientists and engineers without experience in high performance computing to achieve good performance. Even for those who knows how to write efficient code the process for doing so is time consuming and error prone, […]

September 8, 2014 by hgpu

A model of a multilayer device with non-trivial geometrical and material structure and its working process is suggested. The thermal behavior of the device as one principle characteristic is simulated. The algorithm for solving the non-stationary heat conduction problem with a time-dependent periodical heating source is suggested. The algorithm is based on finite difference explicit–implicit […]

August 27, 2014 by hgpu

Multiresolution Analysis (MRA) is a mathematical method that is based on working on a problem at different scales. One of its applications is medical imaging where processing at multiple scales, based on the concept of Gaussian and Laplacian image pyramids, is a well-known technique. It is often applied to reduce noise while preserving image detail […]

August 23, 2014 by hgpu

The last decade saw the long tradition of frequency scaling of processing units grind to a halt, and efforts were re-focused on maintaining computational growth by other means; such as increased parallelism, deep memory hierarchies and complex execution logic. After a long period of "boring productivity", a host of new architectures, accelerators, programming languages and […]

July 24, 2014 by hgpu

The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be […]

July 10, 2014 by hgpu

Many problems in computational science and engineering involve partial differential equations and thus require the numerical solution of large, sparse (non)linear systems of equations. Multigrid is known to be one of the most efficient methods for this purpose. However, the concrete multigrid algorithm and its implementation highly depend on the underlying problem and hardware. Therefore, […]

June 23, 2014 by hgpu

This tutorial is written for beginners as an introduction to basic wave propagation using nite dierence method, from acoustic and elastic wave modeling, to reverse time migration and full waveform inversion. Most of the theoretical delineations summarized in this tutorial have been implemented in Madagascar with Matlab, C and CUDA programming, which will benet readers’ […]

June 8, 2014 by hgpu