https://hgpu.org/?p=6421
Compute-unified device architecture implementation of a block-matching algorithm for multiple graphical processing unit cards