We present a block structured orthogonal factorization (BSOF) algorithm and its parallelization for computing the inversion of block p-cyclic matrices. We aim at high performance on multicores with GPU accelerators. We provide a quantitative performance model for optimal host-device load balance, and validate the model through numerical tests. Benchmarking results show that the parallel BSOF […]

August 23, 2014 by hgpu

A parallel implementation of a method of the semi-Lagrangian type for the advection equation on a hybrid architecture computation system is discussed. The difference scheme with variable stencil is constructed on the basis of an integral equality between the neighboring time levels. The proposed approach allows one to avoid the Courant-Friedrichs-Lewy restriction on the relation […]

August 21, 2014 by hgpu

This book brings together research on numerical methods adapted for Graphics Processing Units (GPUs). It explains recent efforts to adapt classic numerical methods, including solution of linear equations and FFT, for massively parallel GPU architectures. This volume consolidates recent research and adaptations, covering widely used methods that are at the core of many scientific and […]

August 13, 2014 by hgpu

CUMODP is a CUDA library for exact computations with dense polynomials over finite fields. A variety of operations like multiplication, division, computation of subresultants, multi-point evaluation, interpolation and many others are provided. These routines are primarily designed to offer GPU support to polynomial system solvers and a bivariate system solver is part of the library. […]

August 7, 2014 by hgpu

The solution of large-scale Lyapunov equations is an important tool for the solution of several engineering problems arising in optimal control and model order reduction. In this work we investigate the case when the coefficient matrix of the equations presents a band structure. Exploiting the structure of this matrix, we can achieve relevant reductions in […]

August 3, 2014 by hgpu

Sequential Monte Carlo (SMC) methods have been widely used for modern scientific computation. Bayesian model comparison has been successfully applied in many fields. Yet there has been little research on the use of SMC for the purpose of Bayesian model comparison. This thesis studies different SMC strategies for Bayesian model computation. In addition, various […]

July 28, 2014 by hgpu

Two new algorithms for numerical solution of static Hamilton-Jacobi equations are presented. These algorithms are designed to work efficiently on different parallel computing architectures, and numerical results for multicore CPU and GPU implementations are reported and discussed. The numerical experiments show that the proposed solution strategies scale well with the computational power of the hardware. […]

July 26, 2014 by hgpu

CONTEXT: Reinforcement Learning (RL) is a time-consuming effort that also requires a lot of computational power. There are mainly two approaches to improving RL efficiency: the theoretical mathematics and algorithmic approach, or the practical implementation approach. In this study, the approaches are combined in an attempt to reduce time consumption. OBJECTIVES: We investigate […]

July 9, 2014 by hgpu

Parallel computing is a topic that has become very popular in the last few decades. Parallel computers are being used in many different areas of science such as astrophysics, climate modelling, quantum chemistry, fluid dynamics and medicine. Parallel programming is a type of programming where computations can be performed concurrently on different processors or devices. There […]

July 7, 2014 by hgpu

This paper deals with a Galerkin-based multi-scale time integration of a viscoelastic rope model. Using Hamilton’s dynamical formulation, Newton’s equation of motion as a second-order partial differential equation is transformed into two coupled first order partial differential equations in time. The considered finite viscoelastic deformations are described by means of a deformation-like internal variable determined […]

April 24, 2014 by hgpu

This paper presents initial experiments in implementing two notable matrix multiplication algorithms – the DNS algorithm and Cannon’s algorithm – using NVIDIA’s general-purpose graphics processing units (GPGPUs) and CUDA development platform. We demonstrate that these implementations are comparable with traditional methods in terms of computational expense and may scale better than traditional techniques.

April 3, 2014 by hgpu

This report gives a brief introduction to interpolation with radial basis functions and its application to the deformation of computational grids. The FGP algorithm is quoted as an iterative method for the calculation of the interpolation coefficients. A multipole method is described for the efficient approximation of the required matrix-vector product. Results are presented […]

March 25, 2014 by hgpu