https://hgpu.org/?p=5493
CUDA-level performance with Python-level productivity for Gaussian mixture model applications