https://hgpu.org/?p=8571
Optimizing 3D Convolutions for Wavelet Transforms on CPUs with SSE Units and GPUs