https://hgpu.org/?p=6331
Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs