Multi-core CUDA Architecture for Parallelization of Hierarchical Text Clustering

Debasis Pattanaik, Ajaya Kumar Malik, Pradeep Kumar Mallick
Dept. of Computer Science & Applications, ME CSE (KE), Utkal University, Bhubaneswar-751004, India
International Journal on Advanced Computer Theory and Engineering, Volume 2, Issue 4, 2013

Text clustering is the problem of dividing text documents into groups such that documents in the same group are similar to one another and different from documents in other groups. Because texts tend to form hierarchies, text clustering is best performed with a hierarchical clustering method. An important concern when clustering large text databases is the high dimensionality of the representation space: storing the hierarchy trees takes a lot of space, and a lot of time is spent on similarity calculations while clustering the documents. In this paper we propose to parallelize a method that uses a tree-based summarization technique to store cluster summaries in a tree that is kept in memory throughout processing. The results show that our method achieves good accuracy along with a good speedup in calculating clusters.
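To make the cost of the similarity calculations concrete, the following is a minimal Python sketch of comparing one document against a set of cluster summaries. It is not the paper's implementation: the abstract does not name the similarity measure or the contents of a cluster summary, so cosine similarity over term-frequency vectors and summed term counts are assumptions here, and the names `tf_vector`, `cosine`, `nearest_summary`, and `add_to_summary` are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term-frequency vector (word -> count). Assumed representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors. Assumed measure."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_summary(doc, summaries):
    """Index of the cluster summary most similar to doc.

    Each comparison is independent of the others, so this loop is the
    kind of work that could be spread across parallel threads.
    """
    sims = [cosine(doc, s) for s in summaries]
    return max(range(len(sims)), key=sims.__getitem__)


def add_to_summary(summary, doc):
    """Fold a document into a cluster summary by summing term counts."""
    summary.update(doc)
```

Each call to `cosine` touches every term of the document, so the cost grows with both the vocabulary size and the number of summaries compared, which is why the high dimensionality mentioned above makes this step expensive.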