Reducing data access latency in SDSM systems using runtime optimizations
Universitat Politècnica de Catalunya, Barcelona, Spain
Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research, CASCON ’10, 2010
@inproceedings{bueno2010reducing,
title={Reducing data access latency in SDSM systems using runtime optimizations},
author={Bueno, J. and Martorell, X. and Costa, J.J. and Cort{\'e}s, T. and Ayguad{\'e}, E. and Zhang, G. and Barton, C. and Silvera, R.},
booktitle={Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research},
pages={160--173},
year={2010},
organization={ACM}
}
Software Distributed Shared Memory (SDSM) systems offer a convenient way to run applications developed for shared memory systems on distributed systems without modification. However, since SDSM systems add an extra layer of abstraction to the memory hierarchy, applications may suffer performance problems when running on top of them. Our main research interest is to develop a set of compiler and runtime system techniques that widen the range of applications that can run efficiently on SDSM systems. Currently we are targeting OpenMP applications because of the ease of use this programming model provides. In this paper we show the performance of a set of regular applications that perform well on our SDSM system. They were adapted from OpenCL codes provided by ATI and rewritten in OpenMP. More complex applications with different data access patterns are harder to run efficiently on a DSM system. As an example, we show the performance evaluation of the NAS MG benchmark and two techniques we have developed to improve its data locality. Our SDSM infrastructure is composed of NanosDSM, an everything-shared SDSM developed at the Technical University of Catalonia (UPC) and the Barcelona Supercomputing Center (BSC), and the IBM XL SMP Runtime, which allows the execution of the OpenMP applications.
August 21, 2011 by hgpu