https://hgpu.org/?p=14794
Exploring Optimisations for the Local Assembly phase of Finite Element Methods on GPUs