Efficient FFT mapping on GPU for radar processing application: modeling and implementation

Mohamed Amine Bergach, Emilien Kofman, Robert de Simone, Serge Tissot, Michel Syska
Inria
arXiv:1505.08067 [cs.MS], (29 May 2015)

@article{bergach2015efficient,
   archivePrefix={arXiv},
   author={Bergach, Mohamed Amine and Kofman, Emilien and Simone, Robert de and Tissot, Serge and Syska, Michel},
   month={may},
   primaryClass={cs.MS},
   title={Efficient FFT mapping on GPU for radar processing application: modeling and implementation},
   year={2015}
}

General-purpose multiprocessors (in our case, Intel Ivy Bridge and Intel Haswell) increasingly add GPU computing power to earlier multicore architectures. When used for embedded applications with intensive signal processing requirements (for us, synthetic aperture radar), they must constantly compute convolution algorithms, such as the Fast Fourier Transform (FFT). Owing to the "fractal" nature of the FFT (the typical butterfly shape, with larger FFTs defined as combinations of smaller ones plus auxiliary data-array transposes), one can hope to compute analytically the size of the largest FFT that can be performed locally on an elementary GPU compute block. The full application must then be organized around this building-block size. However, because of the data transfers between the various memory levels across CPUs and GPUs, the optimality of such a scheme is only loosely predictable (communication time tends to dominate computation time). A mix of (theoretical) analytic modeling and (practical) runtime validation is therefore needed. As we shall illustrate, this holds at both stages: first when deciding on an elementary FFT block size, then at the full application level.
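
The decomposition mentioned in the abstract (a large FFT built from smaller transforms plus a transpose) can be illustrated with a minimal Python sketch. This is not the authors' GPU implementation: it is the textbook Cooley-Tukey factorization for a length N = n1 * n2, with a naive DFT standing in for the small local transform. The function names `dft`, `fft_two_level`, and `largest_local_fft` are hypothetical, and the capacity bound in `largest_local_fft` (all points must fit in local memory, 8 bytes per single-precision complex point) is a simplifying assumption rather than the model developed in the paper.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, standing in for the small 'building block' transform."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_two_level(x, n1, n2):
    """Length n1*n2 DFT built from n1-point and n2-point DFTs (Cooley-Tukey)."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # Stage 1: n2 independent n1-point DFTs over the strided sub-sequences
    # x[j2], x[j2 + n2], ... (the strided read plays the role of a transpose).
    cols = [dft(x[j2::n2]) for j2 in range(n2)]  # cols[j2][k1]

    # Twiddle factors W_n^(j2 * k1) between the two stages.
    for j2 in range(n2):
        for k1 in range(n1):
            cols[j2][k1] *= cmath.exp(-2j * cmath.pi * j2 * k1 / n)

    # Stage 2: n1 independent n2-point DFTs, written back transposed.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[j2][k1] for j2 in range(n2)])  # row[k2]
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out


def largest_local_fft(local_mem_bytes, bytes_per_point=8):
    """Largest power-of-two point count that fits in local memory.

    Assumption: the only constraint is capacity, with bytes_per_point bytes
    per complex sample (8 for single precision). Real devices add register
    and occupancy constraints that this bound ignores.
    """
    points = local_mem_bytes // bytes_per_point
    if points < 1:
        return 0
    return 1 << (points.bit_length() - 1)
```

For example, `fft_two_level(x, 4, 4)` on a 16-point input should agree with `dft(x)` up to rounding error. On a GPU, each small transform in stage 1 or stage 2 would be the unit of work mapped to one compute block, which is why the block size has to be fixed before the rest of the application is organized around it.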