Three Dimensional Fast Fourier Transform CUDA Implementation

Kumar Aatish, Boyan Zhang
Department of Computer Science and Engineering, University of California, San Diego
University of California, CSE260 Project Report, 2012


   title={Three Dimensional Fast Fourier Transform CUDA Implementation},

   author={Aatish, Kumar and Zhang, Boyan},



Download Download (PDF)   View View   Source Source   



A 3 dimensional DFT can be expressed as 3 DFTs on a 3 dimensional data along each dimension. Each of these 1 dimensional DFTs can be computed efficiently owing to the properties of the transform. This class of algorithms is known as the Fast Fourier Transform (FFT). We introduce the one dimensional FFT algorithm in this section, which will be used in our GPU implementation. Our implementation does an FFT transform in the row major dimension of a given three dimensional matrix at a time. Thus, the complete 3D FFT is a set of 1D FFT kernels and transpose kernels which bring a desired coordinate axis to the row major format to enable coalesced global reads. The implemented kernel performs a single precision 1D FFT and uses the fast math functions for calculating the sin and cos of the phases corresponding to twiddle factors. The GPU used is the C2050 Fermi GPU on the DIRAC cluster on the Carver system provided by the National Energy Research Scientific Computing Center. The NVIDIA Tesla C2050 has 448 parallel CUDA processor cores with 3 GB of memory.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: