Understanding the Costs of Many-Task Computing Workloads on Intel Xeon Phi Coprocessors



Jeffrey Johnson, Scott J. Krieder, Benjamin Grimmer, Justin M. Wozniak, Michael Wilde, Ioan Raicu

Department of Computer Science, Illinois Institute of Technology

2nd Greater Chicago Area System Research Workshop (GCASR), 2013

@article{johnson2013understanding,

title={Understanding the Costs of Many-Task Computing Workloads on Intel Xeon Phi Coprocessors},

author={Johnson, Jeffrey and Krieder, Scott J and Grimmer, Benjamin and Wozniak, Justin M and Wilde, Michael and Raicu, Ioan},

year={2013}

}


Many-Task Computing (MTC) aims to bridge the gap between high-performance computing (HPC) and high-throughput computing (HTC). MTC emphasizes running many computational tasks over a short period of time, where tasks can be either dependent on or independent of one another. MTC has been well supported on clouds, grids, and supercomputers built on traditional computing architectures, but the abundance of hybrid large-scale systems using accelerators has motivated us to explore support for MTC on the new Intel Xeon Phi accelerators. The Xeon Phi is a PCI-Express-based expansion card comprising 60 cores that support 240 hardware threads and deliver up to 1 teraflop of double-precision performance in a single accelerator. These cards are already being integrated into supercomputing clusters such as Stampede, which hosts over 6,400 Xeon Phi accelerators totaling over 7 petaflops of double-precision performance. This work provides an in-depth understanding of MTC on the Intel Xeon Phi and presents our preliminary results from running several different workloads on pre-production Intel Xeon Phi hardware. By using Intel’s SCIF protocol for communicating across the PCI-Express bus, our batch framework has achieved over 90% efficiency, approaching or outperforming OpenMP offloading, for tasks longer than 300 µs. This performance opens the opportunity to develop a framework for executing heterogeneous tasks on the Xeon Phi alongside other potential accelerators, including graphics cards, for MTC applications. Our framework will provide fine granularity for executing MTC applications across large-scale compute clusters. It will be integrated with our existing graphics card framework, GeMTC, to provide transparent access to GPUs, Xeon Phis, and future generations of accelerators, helping to bridge the gap to exascale computing.
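As a rough illustration of what the efficiency figure in the abstract could imply, the sketch below assumes a simple model that is not taken from the paper: efficiency is the useful task time divided by task time plus a fixed per-task dispatch overhead. The function names and the overhead model are hypothetical, and the paper may define efficiency differently.

```python
# Assumed model (not from the paper): each task pays a fixed dispatch
# overhead, and efficiency = task_time / (task_time + overhead).

def efficiency(task_us: float, overhead_us: float) -> float:
    """Fraction of wall time spent on useful work under the assumed model."""
    return task_us / (task_us + overhead_us)


def max_overhead_us(task_us: float, target_efficiency: float) -> float:
    """Largest per-task overhead that still meets the target efficiency."""
    return task_us * (1.0 - target_efficiency) / target_efficiency


if __name__ == "__main__":
    # Overhead budget for a 300 microsecond task at 90% efficiency.
    budget = max_overhead_us(300.0, 0.90)
    print(f"overhead budget: {budget:.1f} us")

    # With that overhead held fixed, shorter tasks lose efficiency
    # and longer tasks gain it.
    for task in (50.0, 300.0, 1000.0):
        print(f"task {task:7.1f} us -> efficiency {efficiency(task, budget):.3f}")
```

Under this assumed model, reaching 90% efficiency on 300 µs tasks leaves roughly a 33 µs budget for per-task overhead, which is one way to see why efficiency is tied to task length.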

Tags: Computer science, Heterogeneous systems, Intel, Intel Phi

October 21, 2013 by hgpu

