Conjugate gradient solver

Block Conjugate Gradient Solver in OpenCL

Kert Tali, Eero Vainikko

Tags: Algorithms, Computer science, Conjugate gradient solver, Linear Algebra, nVidia, nVidia GeForce RTX 2080, OpenCL

July 11, 2021 by hgpu

Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations

Hartwig Anzt, Marc Baboulin, Jack Dongarra, Yvan Fournier, Frank Hülsemann, Amal Khabou, Yushan Wang

Tags: cfd, Conjugate gradient solver, CUDA, Fluid dynamics, nVidia, OpenCL, Tesla K40

July 18, 2016 by hgpu

GPU acceleration of preconditioned solvers for ill-conditioned linear systems

Rohit Gupta

Tags: Computer science, Conjugate gradient solver, MPI, nVidia, Tesla C2070, Thesis

October 11, 2015 by hgpu

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms

Amani AlOnazi, David Keyes, Alexey Lastovetsky, Vladimir Rychkov

Tags: Conjugate gradient solver, CUDA, Fluid dynamics, Heterogeneous systems, MPI, nVidia, nVidia GeForce GTX Titan

May 29, 2015 by hgpu

Conjugate gradient solvers on Intel Xeon Phi and NVIDIA GPUs

O. Kaczmarek, C. Schmidt, P. Steinbrecher, M. Wagner

Tags: Computational Physics, Conjugate gradient solver, CUDA, High Energy Physics - Lattice, Intel Xeon Phi, Mathematical Software, nVidia, Physics, Tesla K20, Tesla K40

November 18, 2014 by hgpu

Pipelined Iterative Solvers with Kernel Fusion for Graphics Processing Units

Karl Rupp, Josef Weinbub, Ansgar Jüngel, Tibor Grasser

Source codes

Tags: AMD FirePro W9000, AMD FirePro W9100, ATI, Computer science, Conjugate gradient solver, CUDA, Linear Algebra, nVidia, OpenCL, Tesla C2050, Tesla K20

October 16, 2014 by hgpu

HISQ inverter on Intel Xeon Phi and NVIDIA GPUs

O. Kaczmarek, C. Schmidt, P. Steinbrecher, Swagato Mukherjee, M. Wagner

Tags: Conjugate gradient solver, High Energy Physics - Lattice, Intel Xeon Phi, nVidia, nVidia GeForce GTX Titan, Physics, QCD, Sparse matrix, Tesla K20, Tesla K40

September 5, 2014 by hgpu

Parallel technologies for solving system of the linear equations by the conjugate gradient method

Eduard Bondarenko

Tags: Computer science, Conjugate gradient solver, CUBLAS, nVidia, OpenACC, OpenMP, Tesla M2070

June 14, 2014 by hgpu

Heterogeneous Computing for Solving System of the Linear Equations by the Conjugate Gradient Method

Eduard Bondarenko

Tags: Computer science, Conjugate gradient solver, CUBLAS, CUDA, Linear Algebra, nVidia, nVidia GeForce GTX 650 Ti, OpenACC

May 29, 2014 by hgpu

An Approach to Efficient FEM Simulations on Graphics Processing Units Using CUDA

Bjorn Nutti, Dragan Marinkovic

Tags: Algorithms, Computer science, Conjugate gradient solver, CUDA, FEM, Finite element method, nVidia, nVidia GeForce 8800 GTX

April 14, 2014 by hgpu

Efficient Preconditioned Conjugate Gradient Parallelization on GPU

A. F. P. Camargos, V. C. Silva

Tags: Conjugate gradient solver, CUDA, Electrodynamics, Factorization, FEM, Finite element method, nVidia, nVidia GeForce GT 240

March 12, 2014 by hgpu

Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms

Amani AlOnazi

Tags: Algorithms, Conjugate gradient solver, CUDA, Fluid dynamics, Heterogeneous systems, Laplace and Poisson equation, Linear Algebra, MPI, Numerical simulation, nVidia, Tesla C2050, Thesis

December 21, 2013 by hgpu
