https://hgpu.org/?p=8831
Reducing GPU Offload Latency via Fine-Grained CPU-GPU Synchronization