N-body Simulation for Astronomical Collisional Systems with a New SIMD Instruction Set Extension to the x86 Architecture, Advanced Vector Extensions


Ataru Tanikawa, Kohji Yoshikawa, Takashi Okamoto, Keigo Nitadori

Center for Computational Science, University of Tsukuba, 1-1-1, Tennodai, Tsukuba, Ibaraki 305-8577, Japan

arXiv:1104.2700 [astro-ph.IM] (14 Apr 2011)


We present a high-performance N-body code for astronomical collisional systems, accelerated with a new SIMD instruction set extension of the x86 architecture: Advanced Vector eXtensions (AVX), an enhanced version of the Streaming SIMD Extensions (SSE). On one core of an Intel Core i7-2600 processor (8 MB cache, 3.40 GHz) based on the Sandy Bridge micro-architecture, we achieved a performance of ~20 giga floating-point operations per second (GFlops) in double precision. This is two and five times higher, respectively, than that of the previously developed code implemented with SSE instructions (Nitadori et al., 2006b) and that of a code implemented without any explicit use of SIMD instructions, on the same processor core. We have parallelized the collisional N-body code using the so-called NINJA scheme (Nitadori et al., 2006a), and achieved ~90 GFlops for a system containing more than N = 8192 particles with 8 MPI processes on four cores. We expect to achieve about 10 tera Flops (TFlops) for an astronomical collisional system with N ~ 10^5 on massively parallel systems with at most 800 cores based on the Sandy Bridge micro-architecture. This performance will be comparable to that of Graphics Processing Unit (GPU) cluster systems. This paper offers an alternative to collisional N-body simulations with GRAPEs and GPUs.
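The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is not taken from the paper; the constant and function names are hypothetical, and every input is one of the approximate ("~") values stated above, so the derived numbers are rough estimates only.

```python
# Approximate figures quoted in the abstract (names here are illustrative).
SINGLE_CORE_GFLOPS = 20.0   # AVX code on one Core i7-2600 core
FOUR_CORE_GFLOPS = 90.0     # 8 MPI processes on four cores, N > 8192
PROJECTED_GFLOPS = 10_000.0 # ~10 TFlops projected for N ~ 10^5
PROJECTED_CORES = 800       # "at most 800 cores"


def per_core_gflops(total_gflops: float, cores: int) -> float:
    """Average throughput per core for a given aggregate rate."""
    return total_gflops / cores


def implied_baselines(avx_gflops: float) -> dict:
    """Rates implied by the stated 2x (SSE) and 5x (no explicit SIMD) ratios."""
    return {"sse": avx_gflops / 2.0, "scalar": avx_gflops / 5.0}


def projected_fraction_of_single_core() -> float:
    """Per-core rate of the projection relative to the single-core figure.

    Because the abstract says "at most 800 cores", this is a lower bound
    on the per-core fraction the projection assumes.
    """
    return per_core_gflops(PROJECTED_GFLOPS, PROJECTED_CORES) / SINGLE_CORE_GFLOPS


if __name__ == "__main__":
    print("four-core run, GFlops per core:", per_core_gflops(FOUR_CORE_GFLOPS, 4))
    print("projection, GFlops per core:", per_core_gflops(PROJECTED_GFLOPS, PROJECTED_CORES))
    print("implied baselines (GFlops):", implied_baselines(SINGLE_CORE_GFLOPS))
    print("projection / single-core:", projected_fraction_of_single_core())
```

Under these assumptions, the 10 TFlops projection needs each of 800 cores to sustain a bit more than half of the measured single-core rate, which leaves room for communication overhead on a large parallel system.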

Tags: Algorithms, Astrophysics, Instrumentation and Methods for Astrophysics, N-body simulation, Stellar dynamics

April 15, 2011 by hgpu
