An Efficient Implementation of Double Precision 1-D FFT for GPUs Using CUDA

Yanjun Liu, Licai Guo, Bin Luo, Xingyi Zhang
School of Computer Science and Technology, Anhui University, Hefei 230039, China
Journal of Information & Computational Science 9:2 (2012) 387-394











The Fast Fourier Transform (FFT) is a well-known and widely used tool in many scientific and engineering fields. CUFFT, NVIDIA's FFT library included in the CUDA toolkit, supports double-precision FFTs. However, the implementation of CUFFT is not very efficient. In this paper, we implement an efficient double-precision Cooley-Tukey algorithm for GPUs using CUDA. Several programming techniques are employed to exploit the hardware characteristics: using on-chip shared memory, removing redundant computation, and coalescing global memory accesses. Experiments show that our 1-D FFT is as fast as CUFFT in general. Furthermore, for small input sizes, our FFT implementation is more than twice as fast as CUFFT.
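For readers unfamiliar with the Cooley-Tukey algorithm named in the abstract, a minimal sketch may help. This is not the authors' CUDA kernel: it is a plain, single-threaded Python illustration of the iterative radix-2 variant, using Python's double-precision complex numbers. The function name `fft_radix2` is hypothetical. The twiddle-factor table computed once up front is only one common way of avoiding redundant computation, and it is an assumption here, since the abstract does not say which computations the paper removes.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 Cooley-Tukey FFT (illustrative sketch only).

    The input length must be a power of two. Returns a new list of
    complex values; Python complex numbers use double precision.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Twiddle factors are computed once and reused by every stage
    # (assumption: one simple way to avoid recomputing them).
    tw = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    size = 2
    while size <= n:
        half = size // 2
        step = n // size  # stride into the shared twiddle table
        for start in range(0, n, size):
            for k in range(half):
                u = a[start + k]
                t = tw[k * step] * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
        size *= 2
    return a
```

Calling `fft_radix2([1, 0, 0, 0])` transforms a unit impulse, whose spectrum is constant. In a GPU version, the inner butterfly loops would be distributed across threads, which is where shared memory and coalesced access become relevant.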