A virtual memory based runtime to support multi-tenancy in clusters with GPUs

Michela Becchi, Kittisak Sajjapongse, Ian Graves, Adam Procter, Vignesh Ravi, Srimat Chakradhar
University of Missouri
21st International Symposium on High-Performance Parallel and Distributed Computing (HPDC '12), 2012

@inproceedings{becchi2012virtual,
   title={A virtual memory based runtime to support multi-tenancy in clusters with {GPUs}},
   author={Becchi, M. and Sajjapongse, K. and Graves, I. and Procter, A. and Ravi, V. and Chakradhar, S.},
   booktitle={Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing},
   pages={97--108},
   year={2012},
   organization={ACM}
}


Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters that include GPUs are still in their infancy. Further, GPU software stacks (e.g., the CUDA driver and runtime) currently provide very limited support for concurrency. In this paper, we propose a runtime system that provides abstraction and sharing of GPUs, while allowing isolation of concurrent applications. A central component of our runtime is a memory manager that provides a virtual memory abstraction to the applications. Our runtime is flexible in terms of scheduling policies, and allows dynamic (as opposed to programmer-defined) binding of applications to GPUs. In addition, our framework supports dynamic load balancing and dynamic upgrade and downgrade of GPUs, and is resilient to their failures. Our runtime can be deployed in combination with VM-based cloud computing services to allow virtualization of heterogeneous clusters, or in combination with HPC cluster resource managers to form an integrated resource management infrastructure for heterogeneous clusters. Experiments conducted on a three-node cluster show that our GPU sharing scheme allows up to a 28% and a 50% performance improvement over serialized execution on short- and long-running jobs, respectively. Further, dynamic inter-node load balancing leads to an additional 18-20% performance benefit.
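The dynamic (rather than programmer-defined) binding of applications to GPUs that the abstract describes can be illustrated with a toy load-aware scheduler. This is only a minimal sketch, not the paper's implementation: the names `GpuClusterScheduler`, `bind`, `release`, and `remove_gpu` are hypothetical, and the real runtime binds applications through its virtual memory manager rather than a simple per-GPU load counter.

```python
class GpuClusterScheduler:
    """Toy model of dynamic application-to-GPU binding.

    Hypothetical sketch: tracks an abstract load figure per GPU and
    binds each incoming application to the least-loaded GPU at launch
    time, instead of relying on a programmer-specified device index.
    """

    def __init__(self, gpu_ids):
        # Abstract load (e.g., outstanding memory demand) per GPU.
        self.load = {gpu: 0 for gpu in gpu_ids}

    def bind(self, app_name, demand):
        # Dynamic binding: choose the least-loaded GPU now, at launch.
        gpu = min(self.load, key=self.load.get)
        self.load[gpu] += demand
        return gpu

    def release(self, gpu, demand):
        # An application finished; return its share of the GPU.
        self.load[gpu] -= demand

    def remove_gpu(self, gpu):
        # Dynamic downgrade or failure: the GPU simply stops being a
        # candidate for future bindings.
        self.load.pop(gpu, None)


if __name__ == "__main__":
    sched = GpuClusterScheduler(["gpu0", "gpu1"])
    print(sched.bind("app_a", 4))  # least-loaded GPU at this moment
    print(sched.bind("app_b", 2))
```

Because binding decisions are made at launch time against current load, the same mechanism naturally supports the load balancing and GPU upgrade/downgrade scenarios the abstract mentions: adding or removing an entry in the load table changes which GPUs future applications can be bound to, without any change to the applications themselves.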