{"id":13879,"date":"2015-04-17T00:11:01","date_gmt":"2015-04-16T21:11:01","guid":{"rendered":"http:\/\/hgpu.org\/?p=13879"},"modified":"2015-04-17T00:11:01","modified_gmt":"2015-04-16T21:11:01","slug":"nbody6gpu-ready-for-the-gravitational-million-body-problem","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=13879","title":{"rendered":"NBODY6++GPU: Ready for the gravitational million-body problem"},"content":{"rendered":"<p>Accurate direct N-body simulations help to obtain detailed information about the dynamical evolution of star clusters. They also enable comparisons with analytical models and Fokker-Planck or Monte-Carlo methods. NBODY6 is a well-known direct N-body code for star clusters, and NBODY6++ is the extended version designed for large particle number simulations by supercomputers. We present NBODY6++GPU, an optimized version of NBODY6++ with hybrid parallelization methods (MPI, GPU, OpenMP, and AVX\/SSE) to accelerate large direct N-body simulations, and in particular to solve the million-body problem. We discuss the new features of the NBODY6++GPU code, benchmarks, as well as the first results from a simulation of a realistic globular cluster initially containing a million particles. For million-body simulations, NBODY6++GPU is 400-2000 times faster than NBODY6 with 320 CPU cores and 32 NVIDIA K20X GPUs. With this computing cluster specification, the simulations of million-body globular clusters including 5% primordial binaries require about an hour per half-mass crossing time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Accurate direct N-body simulations help to obtain detailed information about the dynamical evolution of star clusters. They also enable comparisons with analytical models and Fokker-Planck or Monte-Carlo methods. NBODY6 is a well-known direct N-body code for star clusters, and NBODY6++ is the extended version designed for large particle number simulations by supercomputers. We present NBODY6++GPU, [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[96,89,3],"tags":[1794,14,97,20,176,1390],"class_list":["post-13879","post","type-post","status-publish","format-standard","hentry","category-astrophysics","category-nvidia-cuda","category-paper","tag-astrophysics","tag-cuda","tag-instrumentation-and-methods-for-astrophysics","tag-nvidia","tag-package","tag-tesla-k20"],"views":2594,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/13879","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13879"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/13879\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13879"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13879"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13879"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}