
Posts

Nov, 9

GPU-based Low Dose CT Reconstruction via Edge-preserving Total Variation Regularization

High radiation dose in CT scans increases the lifetime risk of cancer and has become a major clinical concern. Recently, iterative reconstruction algorithms with Total Variation (TV) regularization have been developed to reconstruct CT images from highly undersampled data acquired at low mAs levels in order to reduce the imaging dose. Nonetheless, TV regularization may […]
Nov, 9

How to obtain efficient GPU kernels: an illustration using FMM & FGT algorithms

Computing on graphics processors is maybe one of the most important developments in computational science to happen in decades. Not since the arrival of the Beowulf cluster, which combined open source software with commodity hardware to truly democratize high-performance computing, has the community been so electrified. Like then, the opportunity comes with challenges. The formulation […]
Nov, 9

General purpose Molecular Dynamics Simulations on GPUs: Issues of Pair Forces and Scaling to large Clusters

We present an implementation of a general purpose GPU-Molecular Dynamics code named LAMMPScuda which is based on LAMMPS. It exhibits excellent scaling behavior, allowing for the efficient usage of hundreds of GPUs for a single simulation. At the same time each GPU provides the equivalent performance of approximately 5 modern Quad Core CPUs. By supporting […]
Nov, 9

Improving many flavor QCD simulations using multiple GPUs

We accelerate many-flavor lattice QCD simulations using multiple GPUs. Multiple pseudo-fermion fields are introduced additively and independently for each flavor in the many-flavor HMC algorithm. Using the independence of each pseudo-fermion field and the blocking technique for the quark solver, we can assign the solver task to each GPU card. In this report we present […]
Nov, 9

Direct N-body simulations of globular clusters: (I) Palomar 14

We present the first ever direct $N$-body computations of an old Milky Way globular cluster over its entire lifetime on a star-by-star basis. Using recent GPU hardware at Bonn University, we have performed a comprehensive set of $N$-body calculations to model the distant outer halo globular cluster Palomar 14 (Pal 14). By varying the […]
Nov, 9

Parallel Sparse Matrix Solver on the GPU Applied to Simulation of Electrical Machines

Nowadays, several industrial applications are being ported to parallel architectures. These platforms allow higher performance to be achieved for system modelling and simulation. In the electric machines area, many problems need their solution to be sped up. This paper examines the parallelism of a sparse matrix solver on graphics processors. More specifically, we implement […]
Nov, 9

An Exploration of OpenCL for a Numerical Relativity Application

Currently there is considerable interest in making use of many-core processor architectures, such as Nvidia and AMD graphics processing units (GPUs), for scientific computing. In this work we explore the use of the Open Computing Language (OpenCL) for a typical Numerical Relativity application: a time-domain Teukolsky equation solver (a linear, hyperbolic, partial differential equation solver […]
Nov, 9

Multi GPU Performance of Conjugate Gradient Algorithm with Staggered Fermions

We report results of the performance test of GPUs obtained using the conjugate gradient (CG) algorithm for staggered fermions on the MILC fine lattice ($28^3 \times 96$). We use GPUs of nVIDIA GTX 295 model for the test. When we turn off the MPI communication and use only a single GPU, the performance is 35 […]
Nov, 9

SU(2) Lattice QCD Simulations on Fermi GPUs

In this work we explore the performance of CUDA in lattice SU(2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analysis with multiple GPUs and […]
Nov, 9

Implementation of the Neuberger-Dirac operator on GPUs

Recent developments have shown that a lot can be gained for QCD simulations from GPU hardware. This can be exploited especially in the case of Ginsparg-Wilson fermions when the computational costs are particularly high. In this work, we use the Neuberger-Dirac operator as our realisation of Ginsparg-Wilson fermions, which greatly facilitate lattice investigations of […]
Nov, 9

Staggered fermions simulations on GPUs

We present our implementation of the RHMC algorithm for staggered fermions on Graphics Processing Units using the NVIDIA CUDA programming language. While previous studies exclusively deal with the Dirac matrix inversion problem, our code performs the complete MD trajectory on the GPU. After pointing out the main bottlenecks and how to circumvent them, we discuss […]
Nov, 9

Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics

Graphics Processing Units (GPUs) are having a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations of importance in nuclear and particle physics. The QUDA library provides a package of mixed precision sparse matrix linear solvers for LQCD applications, supporting single GPUs based on NVIDIA’s Compute Unified Device Architecture (CUDA). This library, interfaced to the […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
