https://hgpu.org/?p=10872
Tiled QR Decomposition and Its Optimization on CPU and GPU Computing System