high performance computing on graphics processing units: hgpu.org

Rank k Cholesky Up/Down-dating on the GPU: gpucholmodV0.2

Christian Walder

Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800

arXiv:1011.1173 [cs.DC] (4 Nov 2010)

Package: gpucholmod

In this note we briefly describe our Cholesky modification algorithm for streaming multiprocessor architectures. Our implementation is available in C++ with a Matlab binding, and uses CUDA to utilise the graphics processing unit (GPU). Only limited speedups are possible because the problem is bandwidth-bound. Furthermore, a complex dependency pattern must be obeyed, which requires multiple kernels to be launched. Nonetheless, this makes for an interesting problem, and our approach can reduce the computation time by a factor of around 7 for matrices of size 5000 by 5000 and k=16, in comparison with the LINPACK suite running on a CPU of comparable vintage. Much larger problems can be handled, however, because the GPU memory required by our method scales as O(n).
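For readers unfamiliar with the operation, the following is a minimal sequential CPU sketch of a rank k Cholesky up/down-date, written as k successive rank-1 modifications of a dense lower-triangular factor. It is not the gpucholmod algorithm and makes no attempt to reproduce its kernel structure or memory layout; the names `chol_rank1` and `chol_rankk`, the row-major storage, and the `sigma` sign convention are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense n x n matrix in row-major order (an assumed layout); only the
// lower triangle, including the diagonal, is read or written.
using Matrix = std::vector<double>;

// Hypothetical helper: modifies L in place so that afterwards
//   L * L^T == (old L) * (old L)^T + sigma * x * x^T,
// with sigma = +1 for an update and sigma = -1 for a downdate.
// x is used as workspace and is overwritten. Assumes L has a positive
// diagonal. Returns false if a downdate would make the matrix lose
// positive definiteness.
bool chol_rank1(Matrix& L, std::vector<double>& x, std::size_t n, double sigma) {
    assert(L.size() == n * n && x.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        const double lkk = L[k * n + k];
        const double r2 = lkk * lkk + sigma * x[k] * x[k];
        if (r2 <= 0.0) return false;
        const double r = std::sqrt(r2);
        const double c = r / lkk;
        const double s = x[k] / lkk;
        L[k * n + k] = r;
        // Column k below the diagonal depends on the value of x left by
        // columns 0..k-1, so the columns must be processed in order.
        for (std::size_t i = k + 1; i < n; ++i) {
            L[i * n + k] = (L[i * n + k] + sigma * s * x[i]) / c;
            x[i] = c * x[i] - s * L[i * n + k];
        }
    }
    return true;
}

// Hypothetical helper: rank k modification as k rank-1 sweeps.
// X holds k vectors of length n and is taken by value, so the caller's
// copy is left untouched.
bool chol_rankk(Matrix& L, std::vector<std::vector<double>> X,
                std::size_t n, double sigma) {
    for (auto& x : X) {
        if (!chol_rank1(L, x, n, sigma)) return false;
    }
    return true;
}
```

Calling `chol_rankk(L, X, n, +1.0)` followed by `chol_rankk(L, X, n, -1.0)` with the same `X` should recover the original factor up to rounding. Each sweep reads and writes the whole factor while doing only a few floating-point operations per entry, and each column depends on the ones before it; this is in line with the bandwidth and dependency constraints mentioned in the abstract.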

Tags: Algorithms, Computer science, CUDA, nVidia, Package, Tesla C2050

November 9, 2010 by hgpu
