Efficient Probabilistic Latent Semantic Indexing using Graphics Processing Unit

Eli Koffi Kouassi, Toshiyuki Amagasa, Hiroyuki Kitagawa
Graduate School of Systems and Information Engineering, University of Tsukuba 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8573, JAPAN
Procedia Computer Science, Volume 4, 2011, Pages 382-391, Proceedings of the International Conference on Computational Science (ICCS), 2011

@article{kouassi2011efficient,
   title={Efficient Probabilistic Latent Semantic Indexing using Graphics Processing Unit},
   author={Kouassi, E.K. and Amagasa, T. and Kitagawa, H.},
   journal={Procedia Computer Science},
   volume={4},
   pages={382--391},
   year={2011},
   publisher={Elsevier BV}
}

In this paper, we propose a scheme to accelerate Probabilistic Latent Semantic Indexing (PLSI), an automated document-indexing method based on a statistical latent semantic model, by exploiting the high parallelism of Graphics Processing Units (GPUs). Our proposal comprises three techniques: the first accelerates the Expectation-Maximization (EM) computation by applying GPU matrix-vector multiplication; the second follows the same principle but exploits the sparseness of word-document co-occurrences; and the third uses concurrent kernel execution, available on the NVIDIA Fermi architecture, to further speed up the process. We compare the performance of the proposed scheme with a non-parallelized implementation. The results show that our method can be more than 100 times faster than the CPU-based implementation in our environment. By exploiting the sparseness of the data, we can not only process more documents and words on the GPU, but also keep more data in device memory, avoiding massive data transfers between host and device that would otherwise degrade execution performance.
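The first two techniques in the abstract revolve around expressing the PLSI EM iteration as matrix operations over the word-document co-occurrence counts. A minimal CPU-side NumPy sketch of those EM updates is given below; the function name, shapes, and initialization are our own illustrative choices, not the paper's code. The paper's contribution is mapping these same updates onto GPU matrix-vector products and onto a sparse layout that stores only the nonzero n(d, w) counts.

```python
import numpy as np

def plsi_em(N, K, iters=50, seed=0):
    """Illustrative PLSI EM sketch (dense NumPy; not the paper's GPU code).

    N : (D, W) document-word co-occurrence count matrix n(d, w).
    K : number of latent topics z.
    Returns P(z|d) of shape (D, K) and P(w|z) of shape (K, W).
    """
    rng = np.random.default_rng(seed)
    D, W = N.shape
    # Random, row-normalized initialization of the model parameters.
    p_z_d = rng.random((D, K)); p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((K, W)); p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities P(z|d,w) ∝ P(z|d) P(w|z), shape (D, K, W).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: reweight responsibilities by the observed counts n(d, w).
        # A sparse variant (the paper's second technique) would evaluate
        # these sums only over the nonzero entries of N.
        weighted = N[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)                      # (K, W)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=2)                      # (D, K)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z
```

Each E-step and M-step above reduces to elementwise products and sums along one axis, which is why the authors can cast them as GPU matrix-vector multiplications and run independent ones in concurrent kernels on Fermi-class hardware.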

HGPU group © 2010-2024 hgpu.org
