high performance computing on graphics processing units: hgpu.org

High Performance Direct Gravitational N-body Simulations on Graphics Processing Unit I: An implementation in Cg

Simon Portegies Zwart, Robert Belleman, Peter Geldof

Section Computational Science, University of Amsterdam, Amsterdam, The Netherlands

New Astronomy, Volume 12, Issue 8, p. 641-650, arXiv:astro-ph/0702058 (2 Feb 2007)

DOI:10.1016/j.newast.2007.05.004

We present the results of gravitational direct $N$-body simulations using the commercial graphics processing units (GPUs) NVIDIA Quadro FX1400 and GeForce 8800GTX, and compare the results with GRAPE-6Af special-purpose hardware. The force evaluation of the $N$-body problem was implemented in Cg, using the GPU directly to speed up the calculations. The integration of the equations of motion, running on the host computer, was implemented in C using the 4th-order predictor-corrector Hermite integrator with block time steps. We find that for a large number of particles ($N \gtrsim 10^4$) modern graphics processing units offer an attractive low-cost alternative to GRAPE special-purpose hardware. A modern GPU continues to give a relatively flat scaling with the number of particles, comparable to that of the GRAPE. The GRAPE is designed to reach double precision, whereas the GPU is intrinsically single-precision. For relatively large time steps, the total energy of the $N$-body system was conserved to better than one part in $10^6$ on the GPU, which is impressive given the single-precision nature of the GPU. For the same time steps, the GRAPE gave somewhat more accurate results, by about an order of magnitude. However, smaller time steps allowed better energy accuracy on the GRAPE, around $10^{-11}$, whereas on the GPU the accuracy saturates at machine precision, around $10^{-6}$. For $N \gtrsim 10^6$ the GeForce 8800GTX was about 20 times faster than the host computer. Though still a factor of a few slower than GRAPE, modern GPUs outperform GRAPE in their low cost, long mean time between failures, and much larger onboard memory: the GRAPE-6Af holds at most 256k particles, whereas the GeForce 8800GTX can hold 9 million particles in memory.
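
For readers unfamiliar with the scheme named in the abstract, the following is a minimal sketch of a direct-summation force evaluation combined with a 4th-order Hermite predictor-corrector step. It is not the authors' Cg or C code. The function names (`acc_jerk`, `hermite_step`, `total_energy`, `two_body_demo`) are hypothetical, the units assume $G = 1$, a single shared time step replaces the block time steps used in the paper, and Python floats are double precision, so the energy error it yields says nothing about the single-precision GPU results.

```python
import math

def acc_jerk(pos, vel, mass):
    """Direct O(N^2) summation of acceleration and jerk on each body (G = 1)."""
    n = len(mass)
    acc = [[0.0] * 3 for _ in range(n)]
    jerk = [[0.0] * 3 for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dr = [pos[j][k] - pos[i][k] for k in range(3)]
            dv = [vel[j][k] - vel[i][k] for k in range(3)]
            r2 = sum(x * x for x in dr)
            rv = sum(x * y for x, y in zip(dr, dv))
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += mass[j] * dr[k] * inv_r3
                # Time derivative of the pairwise acceleration.
                jerk[i][k] += mass[j] * (dv[k] - 3.0 * rv / r2 * dr[k]) * inv_r3
    return acc, jerk

def hermite_step(pos, vel, mass, dt):
    """One shared-time-step Hermite step: predict, re-evaluate, correct."""
    n = len(mass)
    a0, j0 = acc_jerk(pos, vel, mass)
    # Predictor: Taylor expansion up to the jerk term.
    xp = [[pos[i][k] + vel[i][k] * dt + a0[i][k] * dt**2 / 2 + j0[i][k] * dt**3 / 6
           for k in range(3)] for i in range(n)]
    vp = [[vel[i][k] + a0[i][k] * dt + j0[i][k] * dt**2 / 2
           for k in range(3)] for i in range(n)]
    a1, j1 = acc_jerk(xp, vp, mass)
    # Corrector: velocity first, then position using the corrected velocity.
    v1 = [[vel[i][k] + (a0[i][k] + a1[i][k]) * dt / 2 + (j0[i][k] - j1[i][k]) * dt**2 / 12
           for k in range(3)] for i in range(n)]
    x1 = [[pos[i][k] + (vel[i][k] + v1[i][k]) * dt / 2 + (a0[i][k] - a1[i][k]) * dt**2 / 12
           for k in range(3)] for i in range(n)]
    return x1, v1

def total_energy(pos, vel, mass):
    """Kinetic plus pairwise potential energy (G = 1)."""
    n = len(mass)
    kin = sum(0.5 * mass[i] * sum(v * v for v in vel[i]) for i in range(n))
    pot = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            pot -= mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return kin + pot

def two_body_demo(steps=100, dt=0.01):
    """Relative energy error for an equal-mass circular binary (illustrative setup)."""
    mass = [0.5, 0.5]
    pos = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
    vel = [[0.0, -0.5, 0.0], [0.0, 0.5, 0.0]]
    e0 = total_energy(pos, vel, mass)
    for _ in range(steps):
        pos, vel = hermite_step(pos, vel, mass, dt)
    return abs((total_energy(pos, vel, mass) - e0) / e0)
```

Calling `two_body_demo()` returns the relative energy drift after the requested number of steps, which is the quantity the abstract uses to compare the GPU and the GRAPE. In the paper's setup only the `acc_jerk` part would run on the GPU, while the predictor and corrector stay on the host.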

Tags: Astrophysics, Cg, N-body simulation, nVidia, nVidia GeForce 8800 GTX, nVidia Quadro FX 1400

November 13, 2010 by hgpu
