https://hgpu.org/?p=11609
Locality optimization on a NUMA architecture for hybrid LU factorization