https://hgpu.org/?p=2451
Understanding the design trade-offs among current multicore systems for numerical computations