https://hgpu.org/?p=24345
Fast CUDA-Aware MPI Datatypes without Platform Support