A virtual memory based runtime to support multi-tenancy in clusters with GPUs

Michela Becchi, Kittisak Sajjapongse, Ian Graves, Adam Procter, Vignesh Ravi, Srimat Chakradhar
University of Missouri
21st International Symposium on High-Performance Parallel and Distributed Computing (HPDC ’12), 2012

@inproceedings{becchi2012virtual,
   title={A virtual memory based runtime to support multi-tenancy in clusters with GPUs},
   author={Becchi, M. and Sajjapongse, K. and Graves, I. and Procter, A. and Ravi, V. and Chakradhar, S.},
   booktitle={Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing},
   pages={97--108},
   year={2012},
   organization={ACM}
}

Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters including GPUs are still in their infancy. Further, GPU software stacks (e.g., CUDA driver and runtime) currently provide very limited support for concurrency. In this paper, we propose a runtime system that provides abstraction and sharing of GPUs, while allowing isolation of concurrent applications. A central component of our runtime is a memory manager that provides a virtual memory abstraction to the applications. Our runtime is flexible in terms of scheduling policies, and allows dynamic (as opposed to programmer-defined) binding of applications to GPUs. In addition, our framework supports dynamic load balancing, dynamic upgrade and downgrade of GPUs, and is resilient to their failures. Our runtime can be deployed in combination with VM-based cloud computing services to allow virtualization of heterogeneous clusters, or in combination with HPC cluster resource managers to form an integrated resource management infrastructure for heterogeneous clusters. Experiments conducted on a three-node cluster show that our GPU sharing scheme allows up to a 28% and a 50% performance improvement over serialized execution on short- and long-running jobs, respectively. Further, dynamic inter-node load balancing leads to an additional 18-20% performance benefit.