Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly
Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118, Tampa, FL 33620, U.S.A.
IEEE Transactions on Knowledge and Data Engineering, 2014
@article{kumar2014computing,
title={Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly},
author={Kumar, Anand and Grupcev, Vladimir and Yuan, Yongke and Huang, Jin and Tu, Yi-Cheng and Shen, Gang},
journal={IEEE Transactions on Knowledge and Data Engineering},
year={2014}
}
This paper focuses on an important query in scientific simulation data analysis: the Spatial Distance Histogram (SDH). The computation time of an SDH query using a brute-force method is quadratic in the number of particles. Such queries are often executed continuously over certain time periods, which further increases the total computation time. We propose a highly efficient approximate algorithm to compute the SDH over consecutive time periods with provable error bounds. The key idea of our algorithm is to derive the statistical distribution of distances from the spatial and temporal characteristics of the particles. After organizing the data into a quad-tree-based structure, we acquire the spatio-temporal characteristics of the particles in each node of the tree to determine the particles’ spatial distribution as well as their temporal locality in consecutive time periods. We also report our efforts in implementing and optimizing the above algorithm on Graphics Processing Units (GPUs) as a means to further improve efficiency. The accuracy and efficiency of the proposed algorithm are backed by mathematical analysis and by the results of extensive experiments using data generated from real simulation studies.
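To make the quadratic cost of the brute-force baseline concrete, the following is a minimal Python sketch of an SDH computed over all point pairs. It is not the paper's algorithm (which is approximate and quad-tree based). The function name `brute_force_sdh`, the parameters `bucket_width` and `num_buckets`, and the choice to clamp overly large distances into the last bucket are all assumptions made for illustration.

```python
import math
from itertools import combinations


def brute_force_sdh(points, bucket_width, num_buckets):
    """Count every pairwise distance into fixed-width buckets.

    Visits all N*(N-1)/2 pairs, so the running time is quadratic in N.
    Distances beyond the last bucket are clamped into it (an assumption
    of this sketch, not something the abstract specifies).
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Hypothetical toy input: four 2D points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = brute_force_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)  # [3, 0, 3]
```

With four points there are six pairs, so the bucket counts sum to six. Doubling the number of points roughly quadruples the work, which is the cost that motivates an approximate approach.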
January 18, 2014 by hgpu