https://hgpu.org/?p=10670
Performance Analysis of a Large Memory Application on Multiple Architectures