
Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters

Andrew J. Younge
Department of Computer Science, Indiana University
Indiana University, 2016

@phdthesis{younge2016architectural,
   title={Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters},
   author={Younge, Andrew J.},
   year={2016},
   school={Indiana University}
}

With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for its scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for data-intensive applications. However, a notable performance gap exists between IaaS and typical high performance computing (HPC) resources. This has limited the applicability of IaaS for many potential users, not only for those who look to leverage the benefits of virtualization with traditional scientific computing applications, but also for the growing number of big data scientists whose platforms are unable to build on HPC's advanced hardware resources. Concurrently, we are at the forefront of a convergence in infrastructure between Big Data and HPC, the implications of which suggest that a unified distributed computing architecture could provide computing and storage capabilities for both classes of distributed systems use cases. This dissertation proposes such an endeavor by leveraging the performance and advanced hardware of the HPC community and providing it in a virtualized infrastructure using High Performance Virtual Clusters. This will not only enable a more diverse user environment within supercomputing applications, but also bring increased performance and capabilities to big data platform services. The project begins with an evaluation of current hypervisors and their viability to run HPC workloads within current infrastructure, which helps define existing performance gaps. Next, mechanisms to enable the use of specialized hardware available in many HPC resources are uncovered, including advanced accelerators such as NVIDIA GPUs and high-speed, low-latency InfiniBand interconnects. The virtualized infrastructure developed here, which leverages such specialized HPC hardware and applies best practices in virtualization with KVM, supports advanced scientific computations common in today's HPC systems. Specifically, we find that example molecular dynamics simulations can run at near-native performance, with only a 1-2% overhead in our virtual cluster. These advances are incorporated into a framework for constructing distributed virtual clusters using the OpenStack cloud infrastructure. With high performance virtual clusters, we look to support a broad range of scientific computing challenges, from HPC simulations to big data analytics, with a single, unified infrastructure.
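The hardware-enablement step described in the abstract centers on handing specialized devices (GPUs, InfiniBand HCAs) directly to KVM guests via PCI passthrough. As a minimal sketch of that mechanism, the following Python snippet uses libvirt's bindings to start a transient KVM guest whose definition includes a VFIO-managed hostdev element. The guest name, sizing, and PCI address (0000:84:00.0) are hypothetical placeholders rather than values from the dissertation, and a bootable disk is omitted for brevity.

import libvirt

# Guest definition with a <hostdev> element that passes the host PCI
# device at 0000:84:00.0 (e.g., a GPU) through to the virtual machine.
# managed='yes' lets libvirt detach the device from the host driver
# and bind it to VFIO automatically.
DOMAIN_XML = """
<domain type='kvm'>
  <name>hpc-vc-node0</name>
  <memory unit='GiB'>16</memory>
  <vcpu>8</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

def launch_gpu_guest():
    # Connect to the local QEMU/KVM hypervisor.
    conn = libvirt.open("qemu:///system")
    try:
        # createXML defines and starts a transient guest in one call.
        dom = conn.createXML(DOMAIN_XML, 0)
        print(f"Guest '{dom.name()}' started with GPU passthrough")
    finally:
        conn.close()

if __name__ == "__main__":
    launch_gpu_guest()

The same hostdev mechanism applies to an InfiniBand HCA, or to an SR-IOV virtual function of one, which is how a virtual cluster node would gain near-native access to the interconnect; in an OpenStack-based framework such as the one the dissertation describes, this device wiring is typically driven by the scheduler rather than written by hand.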