Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation
College of William and Mary, Williamsburg VA 23187, USA
International Workshop on Languages and Compilers for Parallel Computing (LCPC 2011), 2011
@inproceedings{guo2011fine,
title={Fine-grained treatment to synchronizations in {GPU}-to-{CPU} translation},
author={Guo, Z. and Shen, X.},
booktitle={Proc. of the Workshop on Languages and Compilers for Parallel Computing},
year={2011}
}
GPU-to-CPU translation may extend the execution of Graphics Processing Unit (GPU) programs to multi-/many-core CPUs, and hence enable cross-device task migration and promote whole-system synergy. This paper describes some of our findings on the treatment of GPU synchronizations during the translation process. We show that careful dependence analysis may allow a fine-grained treatment of synchronizations and reveal redundant computation at the instruction-instance level. Based on thread-level dependence graphs, we present a method to enable such fine-grained treatment automatically. Experiments demonstrate that, compared to existing translations, the new approach can yield speedups by integer factors.
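To make the synchronization issue concrete, here is a minimal C++ sketch of what a translated thread block might look like on a CPU. It is not the paper's algorithm: the kernels in the comments, the block size `kBlockSize`, and the function names are hypothetical, and the dependence check that justifies the fused variant is done by hand rather than through a thread-level dependence graph. The first function shows the common coarse-grained treatment, where a barrier splits the kernel body into separate loops over the logical threads. The second shows a case where no value crosses threads, so the barrier could be dropped.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kBlockSize = 4;  // hypothetical thread-block size
using Block = std::array<float, kBlockSize>;

// Hypothetical GPU kernel A (one thread block, thread index t):
//   shared[t] = in[t] * 2.0f;
//   __syncthreads();
//   out[t] = shared[t] + shared[(t + 1) % kBlockSize];
//
// Coarse-grained translation: split the body at the barrier and run each
// piece as its own loop over the logical threads (loop fission).
Block translate_with_fission(const Block& in) {
    Block shared{};
    Block out{};
    for (std::size_t t = 0; t < kBlockSize; ++t) {
        shared[t] = in[t] * 2.0f;
    }
    // The barrier becomes the boundary between the two loops, so every
    // element of `shared` is written before any thread reads a neighbor.
    for (std::size_t t = 0; t < kBlockSize; ++t) {
        out[t] = shared[t] + shared[(t + 1) % kBlockSize];
    }
    return out;
}

// Hypothetical GPU kernel B:
//   shared[t] = in[t] * 2.0f;
//   __syncthreads();
//   out[t] = shared[t] + 1.0f;
//
// Each thread reads only the element it wrote itself, so the barrier
// orders nothing across threads. Under that (hand-checked) assumption the
// two loops can be fused and the shared array replaced by a local value.
Block translate_fused(const Block& in) {
    Block out{};
    for (std::size_t t = 0; t < kBlockSize; ++t) {
        const float own = in[t] * 2.0f;
        out[t] = own + 1.0f;
    }
    return out;
}
```

The fused form avoids a second pass over the block and the intermediate array, which is the kind of saving a finer-grained analysis aims to expose.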
April 4, 2012 by hgpu