Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters
Department of Computer Science, Indiana University
Indiana University, 2016
@phdthesis{younge2016architectural,
  title  = {Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters},
  author = {Younge, Andrew J.},
  year   = {2016},
  school = {Indiana University}
}
With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for its scientific computing needs. This is due to their relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for data-intensive applications. However, a notable performance gap exists between IaaS and typical high performance computing (HPC) resources. This has limited the applicability of IaaS for many potential users, not only for those who look to leverage the benefits of virtualization with traditional scientific computing applications, but also for the growing number of big data scientists whose platforms are unable to build on HPC's advanced hardware resources. Concurrently, we are at the forefront of a convergence in infrastructure between Big Data and HPC, the implications of which suggest that a unified distributed computing architecture could provide computing and storage capabilities for both of these differing distributed systems use cases. This dissertation proposes such an endeavor by leveraging the performance and advanced hardware of the HPC community and providing it in a virtualized infrastructure using High Performance Virtual Clusters. This will not only enable a more diverse user environment within supercomputing applications, but also bring increased performance and capabilities to big data platform services. The project begins with an evaluation of current hypervisors and their viability to run HPC workloads within current infrastructure, which helps define existing performance gaps. Next, mechanisms are developed to enable the use of specialized hardware available in many HPC resources, including advanced accelerators such as Nvidia GPUs and high-speed, low-latency InfiniBand interconnects. The resulting virtualized infrastructure, which leverages such specialized HPC hardware and applies best practices in virtualization using KVM, supports advanced scientific computations common in today's HPC systems. Specifically, we find that example Molecular Dynamics simulations can run at near-native performance, with only a 1-2% overhead in our virtual cluster. These advances are incorporated into a framework for constructing distributed virtual clusters using the OpenStack cloud infrastructure. With high performance virtual clusters, we look to support a broad range of scientific computing challenges, from HPC simulations to big data analytics, with a single, unified infrastructure.
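For readers unfamiliar with the mechanics the abstract alludes to, the sketch below shows one way a GPU-enabled KVM guest of this kind could be defined through the libvirt-python bindings, combining VFIO PCI passthrough with common KVM tuning (hugepage memory backing, static vCPU pinning, host CPU passthrough). The domain name, PCI address, resource sizes, and pinned core range are illustrative assumptions, not configuration taken from the dissertation, and a boot disk is omitted for brevity.

```python
# Minimal sketch: define and boot a KVM guest with an Nvidia GPU passed
# through via VFIO, using the libvirt-python bindings. All concrete values
# (name, PCI address 0000:03:00.0, 32 GiB RAM, cores 0-7) are placeholders.
import libvirt

DOMAIN_XML = """
<domain type='kvm'>
  <name>hpc-vc-node0</name>
  <memory unit='GiB'>32</memory>
  <memoryBacking><hugepages/></memoryBacking>    <!-- back guest RAM with hugepages -->
  <vcpu placement='static' cpuset='0-7'>8</vcpu> <!-- pin vCPUs to host cores -->
  <cpu mode='host-passthrough'/>                 <!-- expose the host CPU directly -->
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <!-- PCI passthrough of a GPU bound to the vfio-pci driver on the host;
         the same hostdev mechanism applies to an InfiniBand HCA or its
         SR-IOV virtual functions -->
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

conn = libvirt.open("qemu:///system")  # connect to the local system hypervisor
dom = conn.defineXML(DOMAIN_XML)       # register the guest definition
dom.create()                           # boot the virtual cluster node
conn.close()
```

In a full deployment of the kind the dissertation describes, definitions like this would not be written by hand; an orchestration layer such as OpenStack Nova generates the equivalent libvirt configuration when instances are scheduled onto passthrough-capable hosts.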
March 9, 2017 by hgpu