
Posts

Feb, 18

Sparse systems solving on GPUs with GMRES

Scientific applications very often rely on solving one or more linear systems. When matrices are sparse, iterative methods are preferred to direct ones. Nevertheless, the value of nonzero elements and their distribution (i.e., the sketch of the matrix) greatly influence the efficiency of those methods (in terms of computation time, number of iterations, result precision) […]
Feb, 18

Accelerating Power Flow studies on Graphics Processing Unit

This paper presents the design of a Power Flow algorithm that has enhanced performance on the Graphics Processing Unit (GPU) using Compute Unified Device Architecture (CUDA). This work investigates the performance of optimized CPU versions of Newton-Raphson (Polar form) and Gauss-Jacobi power flow algorithms, highlights the approach used to reduce the computation time by performing these […]
Feb, 18

Performance Comparison of Cholesky Decomposition on GPUs and FPGAs

Cholesky decomposition has been widely utilized for symmetric positive-definite matrix factorization in solving least squares problems. Various parallel accelerators including GPUs and FPGAs have been explored to improve performance. In this paper, Cholesky decomposition is implemented on both FPGAs and GPUs by designing a dedicated architecture for FPGAs and exploiting massively parallel computation for GPUs. […]
Feb, 17

OpenCL Evaluation for Numerical Linear Algebra Library Development

With the help of CUDA [7], [6], many applications improved their performance by using GPUs. In our project called Matrix Algebra on GPU and Multicore Architectures (MAGMA) [10], we mainly focus on dense linear algebra routines similar to those from LAPACK [1]. Other than CUDA, there exist other frameworks that allow platform-independent programming for […]
Feb, 17

Evaluating one-sided programming models for GPU cluster computations

The Global Array toolkit (GA) [1] is a powerful framework for implementing algorithms with irregular communication patterns, such as those of quantum chemistry. On the other hand, accelerators such as GPUs have shown great potential for important kernels in quantum chemistry, for example, atomic integral generation [2] and dense linear algebra in correlated methods [3]. […]
Feb, 17

GPU Accelerated Particle System for Triangulated Surface Meshes

Shape analysis based on images and implicit surfaces has been an active area of research for the past several years. Particle systems have emerged as a viable solution to represent shapes for statistical analysis. One of the most widely used representations of shapes in computer graphics and visualization is the triangular mesh. It is desirable […]
Feb, 17

Medium-Grained Functions Mapping using Modern GPUs

The map is a higher-order function that applies a given function to a list or lists of elements, producing a list of results. The mapped function is applied to each element of the list independently and can thus be performed for all elements in parallel, making the GPU an interesting platform on which to implement it. Although […]
Feb, 17

Simulations of Large Membrane Regions using GPU-enabled Computations – Preliminary Results

In this short paper we present a GPU code for MD simulations of large membrane regions in the NVT and NVE ensembles with explicit solvent. We give an overview of the code and present preliminary performance results.
Feb, 17

Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators

Although the hardware has dramatically changed in the last few years, nodes of multicore chips augmented by Graphics Processing Units (GPUs) seem to be a trend of major importance. Previous approaches for scheduling dense linear operations on such a complex node led to high performance but at the double cost of not using the potential […]
Feb, 17

A Strategy for Automatically Generating High Performance CUDA Code for a GPU Accelerator from a Specialized Fortran Code Expression

Recent microprocessor designs concentrate upon adding cores rather than increasing clock speeds in order to achieve enhanced performance. As a result, in the last few years computational accelerators featuring many cores per chip have begun to appear in high performance scientific computing systems. The IBM Cell processor, with its 9 heterogeneous cores, was the first […]
Feb, 17

Accelerating Algorithms on GPUs in SCIRun: the Conjugate Gradient Case Study

The goal of this research is to integrate graphics processing units (GPUs) into SCIRun, a biomedical problem solving environment, in a way that is transparent to the scientist. We have developed a portable mechanism that allows seamless coexistence of CPU and accelerated GPU computations to provide the best performance while also providing ease of use. […]
Feb, 17

Takagi Factorization on GPU using CUDA

Takagi factorization or symmetric singular value decomposition is a special form of SVD applicable to symmetric complex matrices. The computation takes advantage of symmetry to reduce computation and storage requirements. The Jacobi method with chess tournament ordering was used to perform the computation in parallel on a GPU using the CUDA programming model. We were […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
