GPU-Accelerated Light Stemmer for the Arabic Language
Faculty of Computer and Information Sciences, Ain Shams University, Abasia, Cairo, Egypt
Journal of Computer Science and Applications, Volume 4, Number 2, pp. 105-118, 2012
@article{sophoclis2012gpu,
title={GPU-Accelerated Light Stemmer for the Arabic Language},
author={Sophoclis, N.N. and Abdeen, M. and El-Horbaty, E.S.M.},
journal={Journal of Computer Science and Applications},
volume={4},
number={2},
pages={105--118},
year={2012}
}
Data preprocessing is a vital part of information retrieval, and stemming is one of its major tasks. The goal of stemming is to reduce the inflectional, and some of the derivational, forms of a word to its base form. Given the massive amounts of data on the web, preprocessing generally consumes a major portion of the execution time of an information retrieval application. Thus, improving the performance of text preprocessing generally improves the overall performance of the application. In this paper, we present a novel approach for accelerating the preprocessing of Arabic text by introducing a parallel Arabic light stemmer that runs on a graphics processing unit (GPU). We also propose several optimization techniques to improve the efficiency of running the stemming algorithm on the GPU. To assess the effect of these optimizations on the performance of the GPU stemmer, we provide multiple OpenCL implementations that combine the different techniques. Our implementations are tested on an NVIDIA GeForce 310M GPU. Our preliminary experimental results show promising overall speed-up factors over the non-GPU sequential approach.
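To make the idea of light stemming concrete, the following is a minimal sequential sketch in Python. It is not the authors' algorithm or their OpenCL code: the affix lists, the `MIN_STEM_LEN` threshold, and the function names `light_stem` and `stem_all` are all assumptions chosen for illustration. It strips at most one common prefix and one common suffix from each word, and it shows why the task is naturally data-parallel: each word is processed independently of the others.

```python
# Illustrative affix lists (an assumption, not taken from the paper).
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")

# Hypothetical lower bound on the remaining stem length, so that
# very short words are not stripped down to nothing.
MIN_STEM_LEN = 2


def light_stem(word: str) -> str:
    """Strip at most one prefix and one suffix, trying longer affixes first."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[: -len(suffix)]
            break
    return word


def stem_all(words):
    """Stem each word independently of the others.

    No word depends on any other word, so in a GPU version one
    work-item could plausibly be assigned to each word.
    """
    return [light_stem(w) for w in words]


if __name__ == "__main__":
    for original in ["والكتاب", "المعلمون", "مكتبة"]:
        print(original, light_stem(original))
```

In this sketch the loop in `stem_all` is the part a GPU implementation would replace with a parallel kernel launch; the per-word logic in `light_stem` is what each parallel unit would execute.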
August 21, 2012 by hgpu