We review the recent optimizations of gravitational N-body kernels for running them on graphics processing units (GPUs), on single hosts and massively parallel platforms. For each of the two main N-body techniques, direct summation and tree-codes, we discuss the optimization strategy, which is different for each algorithm. Because both the accuracy as well as the […]

September 23, 2014 by hgpu

We present a high-performance N-body code for astronomical collisional systems accelerated with the aid of a new SIMD instruction set extension of the x86 architecture: Advanced Vector eXtensions (AVX), an enhanced version of the Streaming SIMD Extensions (SSE). With one core of an Intel Core i7-2600 processor (8 MB cache and 3.40 GHz) based on Sandy […]

April 15, 2011 by hgpu

In this paper I will outline some of the aspects and problems of modern celestial mechanics and stellar dynamics, in the context of rapidly growing computing facilities. I will draw attention to the great advantages of using modern, fast and cheap Graphics Processing Units (GPUs), acting as true supercomputers, for astrophysical simulations. […]

November 13, 2010 by hgpu

We study the dynamics of stellar-mass black holes (BH) in star clusters with particular attention to the formation of BH-BH binaries, which are interesting as sources of gravitational waves (GW). We examine the properties of these BH-BH binaries through direct N-body simulations of star clusters using the GPU-enabled NBODY6 code. We perform simulations of N […]

November 13, 2010 by hgpu

We present an algorithm named “Chamomile Scheme”. The scheme is fully optimized for calculating gravitational interactions on the latest programmable Graphics Processing Unit (GPU), NVIDIA GeForce 8800 GTX, which has (a) small but fast shared memories (16 KB × 16) with no broadcasting mechanism and (b) floating point arithmetic hardware of 500 Gflop/s but only for […]

November 9, 2010 by hgpu

At the end of 2006 NVIDIA introduced a new generation of graphics processing units (GPUs), the so-called G80 architecture. These GPUs are more powerful than any of the GPUs released before; they offer up to 350 billion floating-point operations per second (GFLOP/s) in certain situations. With the introduction of this hardware NVIDIA released a […]

November 9, 2010 by hgpu

We present the results of gravitational direct $N$-body simulations using the commercial graphics processing units (GPUs) NVIDIA Quadro FX1400 and GeForce 8800GTX, and compare the results with GRAPE-6Af special purpose hardware. The force evaluation of the $N$-body problem was implemented in Cg using the GPU directly to speed up the calculations. The integration of the equations […]

November 9, 2010 by hgpu

Commercial graphics processors (GPUs) have high compute capacity at very low cost, which makes them attractive for general purpose scientific computing. In this paper we show how graphics processors can be used for N-body simulations to obtain improvements in performance over current generation CPUs. We have developed a highly optimized algorithm for performing the O(N^2) […]

November 8, 2010 by hgpu

In this paper, we describe the architecture and performance of the GraCCA system, a Graphic-Card Cluster for Astrophysics simulations. It consists of 16 nodes, with each node equipped with 2 modern graphics cards, the NVIDIA GeForce 8800 GTX. This computing cluster provides a theoretical performance of 16.2 TFLOPS. To demonstrate its performance in astrophysics computation, […]

November 8, 2010 by hgpu

We present and discuss the characteristics and performance, both in terms of computational speed and precision, of a code which numerically integrates the equations of motion of N ‘particles’ interacting via Newtonian gravitation and moving in a smooth external galactic field. The force evaluation on every particle is done by means of direct summation […]

November 4, 2010 by hgpu

We present the results of gravitational direct N-body simulations using the graphics processing unit (GPU) on a commercial NVIDIA GeForce 8800GTX designed for gaming computers. The force evaluation of the N-body problem is implemented in “Compute Unified Device Architecture” (CUDA) using the GPU to speed up the calculations. We tested the implementation on three different […]

October 30, 2010 by hgpu