https://hgpu.org/?p=11397
Accelerator Aware MPI Micro-benchmarking using CUDA, OpenACC and OpenCL