
MDR: performance model driven runtime for heterogeneous parallel platforms

Jacques A. Pienaar, Anand Raghunathan, Srimat Chakradhar
Purdue University, West Lafayette, IN, USA
Proceedings of the international conference on Supercomputing, ICS ’11, 2011

@inproceedings{pienaar2011mdr,
   title={MDR: performance model driven runtime for heterogeneous parallel platforms},
   author={Pienaar, J.A. and Raghunathan, A. and Chakradhar, S.},
   booktitle={Proceedings of the international conference on Supercomputing},
   pages={225--234},
   year={2011},
   organization={ACM}
}


We present a runtime framework for executing workloads represented as parallel-operator directed acyclic graphs (PO-DAGs) on heterogeneous multi-core platforms. PO-DAGs combine coarse-grained parallelism at the graph level with fine-grained parallelism within each node, which lends itself naturally to exploiting the intra- and inter-processing-element parallelism present in heterogeneous platforms. We identify four important criteria, Suitability, Locality, Availability, and Criticality (SLAC), and show that a heterogeneous runtime framework must consider all of them to achieve good performance under varying application and platform characteristics. The proposed model-driven runtime (MDR) considers all four criteria, and the tradeoffs among them, by using performance models. These models drive key run-time decisions such as mapping tasks to processing elements (PEs), scheduling tasks on each PE, and copying data between memory spaces. We discuss the software architecture and implementation of MDR, and evaluate it using several benchmark programs on three heterogeneous platforms that contain multi-core CPUs and GPUs. The hardware platforms represent server-, laptop-, and netbook-class systems. MDR achieves up to 4.2X speedup (1.5X on average) over the best of CPU-only, GPU-only, round-robin, GPU-first, and utilization-driven schedulers. We also perform a sensitivity analysis that establishes the importance of considering all four SLAC criteria to achieve high-performance execution in a heterogeneous runtime framework.
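
To make the four SLAC criteria concrete, the following is a minimal illustrative sketch in Python of how a performance model could drive the mapping of DAG tasks to processing elements. It is not MDR's algorithm or API: it is a simple HEFT-style greedy list scheduler, and every name in it (`PE`, `Task`, `schedule`, the `bandwidth` parameter, the example timings) is a hypothetical assumption made for illustration. It also simplifies by treating each task's input as residing in one fixed memory space.

```python
from dataclasses import dataclass, field

@dataclass
class PE:
    name: str
    memory: str               # memory space this PE computes from
    busy_until: float = 0.0   # Availability: time at which the PE is free

@dataclass
class Task:
    name: str
    exec_time: dict           # Suitability: predicted run time (s) per PE name
    input_bytes: int
    input_location: str       # Locality: memory space holding the input
    deps: list = field(default_factory=list)

def transfer_time(task, pe, bandwidth):
    """Predicted copy cost; zero when the input is already local to the PE."""
    if task.input_location == pe.memory:
        return 0.0
    return task.input_bytes / bandwidth

def schedule(tasks, pes, bandwidth=1e9):
    """Greedy mapping: most critical task first, earliest predicted finish wins.

    Assumes all predicted execution times are strictly positive, so ordering
    by descending rank is also a valid topological order of the DAG.
    """
    by_name = {t.name: t for t in tasks}
    succ = {t.name: [] for t in tasks}
    for t in tasks:
        for d in t.deps:
            succ[d].append(t.name)

    rank = {}
    def upward_rank(name):
        # Criticality: length of the longest remaining path through the DAG,
        # using each task's best-case predicted execution time.
        if name not in rank:
            best = min(by_name[name].exec_time.values())
            rank[name] = best + max((upward_rank(s) for s in succ[name]),
                                    default=0.0)
        return rank[name]

    mapping, finish = {}, {}
    for t in sorted(tasks, key=lambda t: -upward_rank(t.name)):
        ready = max((finish[d] for d in t.deps), default=0.0)
        best_end, best_pe = None, None
        for pe in pes:
            start = max(ready, pe.busy_until)
            end = start + transfer_time(t, pe, bandwidth) + t.exec_time[pe.name]
            if best_end is None or end < best_end:
                best_end, best_pe = end, pe
        best_pe.busy_until = best_end
        mapping[t.name] = best_pe.name
        finish[t.name] = best_end
    return mapping, finish

# Hypothetical example: two PEs with separate memory spaces, three tasks.
pes = [PE("cpu", "host"), PE("gpu", "device")]
tasks = [
    Task("A", {"cpu": 4.0, "gpu": 1.0}, 2_000_000_000, "host"),
    Task("B", {"cpu": 2.0, "gpu": 1.5}, 4_000_000_000, "host"),
    Task("C", {"cpu": 3.0, "gpu": 1.0}, 1_000_000_000, "device", deps=["A"]),
]
mapping, finish = schedule(tasks, pes)
print(mapping)  # {'A': 'gpu', 'B': 'cpu', 'C': 'gpu'}
```

In this toy example, task B is faster on the GPU in isolation, yet the sketch places it on the CPU because the copy cost and the GPU's queue outweigh the raw speed advantage. This is the kind of tradeoff among criteria that the abstract argues a heterogeneous runtime must weigh.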
