A computationally efficient and scalable approach for privacy preserving kNN classification
Sri Sathya Sai Institute of Higher Learning, Prashanthi Nilayam, India
IEEE International Conference on High Performance Computing (HiPC), 2012
@inproceedings{ravu2012computationally,
  title={A computationally efficient and scalable approach for privacy preserving kNN classification},
  author={Ravu, S. and Neelakandan, P. R. and Gorai, M. R. and Mukkamala, R. and Baruah, P. K.},
  booktitle={IEEE International Conference on High Performance Computing (HiPC)},
  year={2012}
}
In the modern age, there is a great desire to mine users’ personal data from varied sources to discover their behaviours. However, growing awareness among organizations of user-data privacy, together with strict government privacy regulations, has led to increasing resistance to sharing data directly with others. Encryption is used in the literature to achieve privacy preservation in data mining. Our technique applies Bloom filters to the sensitive data while still allowing collaborative data mining, in particular kNN classification. In this work, we propose a parallel GPU implementation of the most time-consuming part of the algorithm, i.e., the similarity computation on the Bloom-filtered records based on the modified Jaccard metric, and the classification of records. From our findings, we conclude that the proposed parallel implementation, apart from being cost-effective, is highly scalable and can accommodate huge data sets. The parallel implementation achieves an average speedup of 20 over the serial implementation. Further, the speedup increases with the size of the data set considered.
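The pipeline the abstract outlines (Bloom-filter encoding of records, a Jaccard-style similarity on the filters, then kNN voting) can be illustrated with a small serial sketch. This is not the authors' implementation: the abstract does not define the modified Jaccard metric, so the plain Jaccard coefficient on set bits is swapped in here, and the filter length `M`, hash count `K`, the SHA-256 based hashing, and all function names are assumptions chosen only for illustration. The paper's contribution is a GPU-parallel version of the similarity and classification steps, which this sketch does not attempt.

```python
import hashlib
from collections import Counter

M = 64  # Bloom filter length in bits (assumed, not from the paper)
K = 3   # number of hash functions per token (assumed)


def bloom_encode(tokens, m=M, k=K):
    """Encode a record's tokens as an m-bit Bloom filter stored in an int."""
    bits = 0
    for tok in tokens:
        for i in range(k):
            # Derive k hash positions by salting SHA-256 with the index i.
            digest = hashlib.sha256(f"{i}:{tok}".encode()).digest()
            pos = int.from_bytes(digest[:8], "big") % m
            bits |= 1 << pos
    return bits


def jaccard(a, b):
    """Plain Jaccard on set bits: |a AND b| / |a OR b|.

    Stand-in for the paper's modified Jaccard metric, which the
    abstract does not specify.
    """
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two empty filters are treated as identical
    return bin(a & b).count("1") / union


def knn_classify(query, train, k=3):
    """Return the majority label among the k filters most similar to query.

    train is a list of (bloom_filter, label) pairs.
    """
    ranked = sorted(train, key=lambda rec: jaccard(query, rec[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch each party would share only the output of `bloom_encode`, never the raw tokens. The pairwise `jaccard` calls are independent of one another, which is what makes the similarity step a natural candidate for the GPU parallelization the abstract describes.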
December 8, 2012 by hgpu