
Implementation of large-scale FIR adaptive filters on NVIDIA GeForce graphics processing unit

Akihiro Hirano, Kenji Nakayama
Kanazawa University, Kakuma-Machi, Kanazawa, 920-1192, Japan
International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), 2010

This paper presents implementations of an FIR adaptive filter with a large number of taps on an NVIDIA GeForce graphics processing unit (GPU) using the CUDA software development environment. To overcome the long latency of off-chip memory accesses, the number of memory accesses is reduced through re-ordering and vector load/store operations, and the number of threads is increased. A tree adder is introduced to reduce the cost of summing the thread outputs. Simultaneous execution of multiple filters is also examined. On a low-cost platform such as an Atom/ION nettop, the GPU accelerates the computation by almost three times. For simultaneous multiple simulations, such as ensemble averaging, a GPU with a large number of processing elements outperforms a dual-core CPU: almost six times faster for 16 runs.
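The vector load/store and tree-adder ideas mentioned in the abstract can be illustrated roughly as follows. This is a minimal CUDA sketch, not the authors' implementation: it assumes the input samples have already been re-ordered into a per-output, float4-aligned layout, and the kernel name, block size, and data layout are assumptions made for illustration only. The coefficient-update step of the adaptive filter and the simultaneous multi-filter execution are omitted.

```cuda
// Minimal sketch (not the paper's code): one thread block computes one output
// sample of an N-tap FIR filter.  The taps are split across the threads,
// loaded four at a time with float4 to cut the number of off-chip memory
// transactions, and the per-thread partial sums are combined with a
// shared-memory tree adder.  BLOCK_SIZE, the kernel name, and the per-output
// data layout are illustrative assumptions.

#include <cuda_runtime.h>

#define BLOCK_SIZE 256   // assumed number of threads per block

// x         : input samples, assumed re-ordered so the window for each output
//             sample is contiguous and float4-aligned
// w         : filter coefficients, packed as float4 (numTapVec = taps / 4)
// y         : output samples, one per thread block
// numTapVec : number of float4 tap vectors (assumed multiple of BLOCK_SIZE)
__global__ void fir_block_kernel(const float4 *x, const float4 *w,
                                 float *y, int numTapVec)
{
    __shared__ float partial[BLOCK_SIZE];

    const int out = blockIdx.x;      // output sample handled by this block
    const int tid = threadIdx.x;

    // Each thread accumulates a strided slice of the tap-sample products,
    // reading four values per load (the vector load of the abstract).
    float acc = 0.0f;
    for (int k = tid; k < numTapVec; k += BLOCK_SIZE) {
        float4 xv = x[out * numTapVec + k];  // assumed per-output layout
        float4 wv = w[k];
        acc += xv.x * wv.x + xv.y * wv.y + xv.z * wv.z + xv.w * wv.w;
    }
    partial[tid] = acc;
    __syncthreads();

    // Tree adder: halve the number of active threads at each step.
    for (int stride = BLOCK_SIZE / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    if (tid == 0)
        y[out] = partial[0];
}
```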