https://hgpu.org/?p=17135
Parallel Multi Channel Convolution using General Matrix Multiplication