https://hgpu.org/?p=4072
From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation