https://hgpu.org/?p=3532
FFT Implementation on a Streaming Architecture