https://hgpu.org/?p=16702
Optimization and parallelization of B-spline based orbital evaluations in QMC on multi/many-core shared memory processors