Efficient Implementation of RLS-Based Adaptive Filters on nVIDIA GeForce Graphics Processing Unit
Kanazawa University
Proc. of 27th SIP Symposium
@inproceedings{hirano2012efficient,
title={Efficient Implementation of RLS-Based Adaptive Filters on nVIDIA GeForce Graphics Processing Unit},
author={Hirano, Akihiro and Nakayama, Kenji},
booktitle={第 27 回信号処理シンポジウム講演論文集= Proc. of 27th SIP Symposium},
number={2012},
pages={241--245},
year={2012}
}
This paper presents an efficient implementation of RLS-based adaptive filters with a large number of taps on an nVIDIA GeForce graphics processing unit (GPU) using the CUDA software development environment. Reordering and combining the calculations reduces the number of accesses to slow off-chip memory. The assignment of tasks to multiple threads also takes memory access order into account. Multiple shader processor arrays are used to handle a large matrix. For an 8192-tap case, the GPU program is almost 30 times faster than a CPU program. Real-time processing is possible for an 8 kHz sampling rate and 512 taps using 32 shader processors, which is only 25% of those available on a GeForce 8800GTS.
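For readers unfamiliar with the recursion the abstract refers to, the following is a minimal sketch of the textbook exponentially weighted RLS update in plain Python. It is not the paper's GPU implementation and does not reproduce its memory-access ordering. The function names (`rls_update`, `identify_demo`), the forgetting factor `lam`, the initialisation constant `delta`, and the 3-tap test system are all assumptions chosen for illustration.

```python
import random


def rls_update(w, P, x, d, lam=0.99):
    """One textbook RLS step (illustrative only, not the paper's GPU code).

    w: tap weights, P: inverse correlation matrix estimate (symmetric),
    x: current input vector, d: desired sample, lam: forgetting factor.
    """
    n = len(w)
    # Px = P x. Because P is symmetric, x^T P equals (P x)^T,
    # so this single matrix-vector product is reused below.
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [Px[i] / denom for i in range(n)]            # gain vector
    e = d - sum(w[i] * x[i] for i in range(n))       # a priori error
    w_new = [w[i] + k[i] * e for i in range(n)]
    # Rank-one update of P; this is the O(n^2) part of every step.
    P_new = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return w_new, P_new, e


def identify_demo(h=(0.5, -0.3, 0.2), samples=200, delta=1000.0, seed=0):
    """Identify a hypothetical noise-free FIR system h from random input."""
    rng = random.Random(seed)
    n = len(h)
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = [0.0] * n
    for _step in range(samples):
        x = [rng.uniform(-1.0, 1.0)] + x[:-1]        # tapped delay line
        d = sum(h[i] * x[i] for i in range(n))
        w, P, _err = rls_update(w, P, x, d)
    return w
```

Calling `identify_demo()` returns the estimated tap weights for the assumed test system. Every step touches all N² entries of `P`, so a 512-tap filter updates 262,144 matrix entries per sample, and at 8 kHz sampling each update must finish within 125 µs. That per-sample matrix traffic is why the order of memory accesses matters for a large number of taps.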
September 5, 2013 by hgpu