https://hgpu.org/?p=1368
The Chamomile Scheme: An Optimized Algorithm for N-body simulations on Programmable Graphics Processing Units