Progressive Clustering of Big Data with GPU Acceleration and Visualization

Jun Wang, Eric Papenhausen, Bing Wang, Sungsoo Ha, Alla Zelenyuk, Klaus Mueller
Visual Analytics and Imaging Lab, Computer Science Department, Stony Brook University, Stony Brook, NY, USA
New York Scientific Data Summit (NYSDS), 2017








Clustering has become an unavoidable step in big data analysis. It can arrange data into a compact format, making operations on big data manageable. However, clustering big data requires not only handling large volume and high dimensionality but also processing streaming data, capabilities that are underdeveloped in most current algorithms. Furthermore, big data processing is seldom interactive, which conflicts with users' need for immediate answers. The best one can do is to process the data incrementally, so that partial and, ideally, accurate results become available relatively quickly and are then progressively refined over time. We propose a clustering framework that uses Multi-Dimensional Scaling for layout and GPU acceleration to accomplish these goals. Our domain application is the clustering of mass spectral data of individual aerosol particles, comprising 8 million data points of 450 dimensions each.
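
The idea of incremental processing with progressively refined results can be illustrated with a minimal sketch. This is not the authors' framework: it omits the Multi-Dimensional Scaling layout and GPU acceleration entirely, and it swaps in plain sequential (online) k-means as a stand-in clustering method. The function name `progressive_kmeans` and the toy stream are hypothetical, chosen only for illustration.

```python
def progressive_kmeans(stream, k):
    """Hypothetical sketch: yield refined centroids after each data chunk.

    Uses sequential (online) k-means as a stand-in; this is an assumption,
    not the method described in the paper.
    """
    centroids = []  # running cluster centers
    counts = []     # number of points absorbed by each center
    for chunk in stream:
        for point in chunk:
            if len(centroids) < k:
                # Seed the centers with the first k points seen.
                centroids.append(list(point))
                counts.append(1)
                continue
            # Assign the point to the nearest center (squared Euclidean distance).
            j = min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            # Move that center toward the point by a running-mean step.
            counts[j] += 1
            step = 1.0 / counts[j]
            centroids[j] = [q + step * (p - q) for p, q in zip(point, centroids[j])]
        # A partial result is available as soon as the chunk is processed.
        yield [list(c) for c in centroids]


if __name__ == "__main__":
    # Toy stream of two chunks with 2-D points (illustrative data only).
    toy_stream = [[(0, 0), (10, 10)], [(0, 2), (10, 12)]]
    for i, partial in enumerate(progressive_kmeans(toy_stream, k=2), start=1):
        print(f"after chunk {i}: {partial}")
```

Because the function is a generator, a caller can display each partial result while later chunks are still arriving, which is the property the abstract asks for. Each centroid is a running mean of the points assigned to it, so the estimate is refined as more data is seen.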
