https://hgpu.org/?p=11704
Implementation of Parallel Fast Hartley Transform (FHT) Using CUDA