A Highly Parallel Reuse Distance Analysis Algorithm on GPUs
SKL Computer Architecture, Institute of Computing Technology, CAS, Beijing, China
26th IEEE International Parallel and Distributed Processing Symposium (IPDPS’12), 2012
@article{cui2012highly,
title={A Highly Parallel Reuse Distance Analysis Algorithm on GPUs},
author={Cui, Huimin and Yi, Qing and Xue, Jingling and Wang, Lei and Yang, Yang and Feng, Xiaobing},
year={2012}
}
Reuse distance analysis is a runtime approach that has been widely used to accurately model the memory system behavior of applications. However, traditional reuse distance analysis algorithms use tree-based data structures and are hard to parallelize, missing the tremendous computing power of modern architectures such as the emerging GPUs. This paper presents a highly-parallel reuse distance analysis algorithm (HP-RDA) to speedup the process using the SPMD execution model of GPUs. In particular, we propose a hybrid data structure of hash table and local arrays to flatten the traditional tree representation of memory access traces. Further, we use a probabilistic model to correct any loss of precision from a straightforward parallelization of the original sequential algorithm. Our experimental results show that using an NVIDIA GPU, our algorithm achieves a factor of 20 speedup over the traditional sequential algorithm with less than 1% loss in precision.
March 30, 2012 by hgpu