Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints
Politecnico di Milano, Italy
International Conference on Design, Automation and Test in Europe (DATE 2015), 2015
@inproceedings{paone2015customization,
title={Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints},
author={Paone, Edoardo and Robino, Francesco and Palermo, Gianluca and Zaccaria, Vittorio and Sander, Ingo and Silvano, Cristina},
booktitle={International Conference on Design, Automation and Test in Europe (DATE 2015)},
year={2015}
}
When targeting an OpenCL application to platforms with multiple heterogeneous accelerators, task tuning and mapping have to cope with device-specific constraints. To address this problem, we present an innovative design flow for the customization and performance optimization of OpenCL applications on heterogeneous parallel platforms. It consists of two phases: 1) a tuning phase that optimizes each application kernel for a given platform and 2) a task-mapping phase that maximizes the overall application throughput by exploiting concurrency in the application task graph. The tuning phase customizes parameterized OpenCL kernels under device-specific constraints. The mapping phase then improves task-level parallelism for multi-device execution while accounting for the overhead of memory transfers, namely the overheads implied by using multiple OpenCL contexts for different device vendors. The benefits of the proposed design flow have been assessed on a stereo-matching application targeting two commercial heterogeneous platforms.
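The two-phase structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the device names, work-group limits, timing model (`toy_time`), task graph, transfer cost, and the exhaustive search are all hypothetical choices made only to show the shape of a tune-then-map flow.

```python
from itertools import product

# Hypothetical devices with one device-specific constraint each.
DEVICES = {"cpu": {"max_wg": 64}, "gpu": {"max_wg": 256}}

# Hypothetical run time of each task at work-group size 64, per device.
BASE_TIME = {
    "a": {"cpu": 2.0, "gpu": 8.0},
    "b": {"cpu": 3.0, "gpu": 8.0},
    "c": {"cpu": 6.0, "gpu": 4.0},
}

# Task graph as task -> list of predecessors; ORDER is a topological order.
GRAPH = {"a": [], "b": [], "c": ["a", "b"]}
ORDER = ["a", "b", "c"]


def toy_time(task, device, wg):
    """Toy cost model (assumption): time shrinks linearly with work-group size."""
    return BASE_TIME[task][device] * 64 / wg


def tune(task, device, wg_sizes=(16, 64, 256)):
    """Phase 1: best (time, work-group size) among sizes the device allows."""
    feasible = [wg for wg in wg_sizes if wg <= DEVICES[device]["max_wg"]]
    return min((toy_time(task, device, wg), wg) for wg in feasible)


def makespan(graph, order, mapping, exec_time, transfer_cost):
    """Finish time of the graph; each device runs one task at a time."""
    finish = {}
    device_free = {d: 0.0 for d in DEVICES}
    for task in order:
        dev = mapping[task]
        ready = 0.0
        for pred in graph[task]:
            # Crossing devices pays a fixed (hypothetical) transfer cost.
            cost = transfer_cost if mapping[pred] != dev else 0.0
            ready = max(ready, finish[pred] + cost)
        start = max(ready, device_free[dev])
        finish[task] = start + exec_time[task][dev]
        device_free[dev] = finish[task]
    return max(finish.values())


def best_mapping(graph, order, exec_time, transfer_cost):
    """Phase 2: exhaustive search over task-to-device assignments."""
    best = None
    for choice in product(DEVICES, repeat=len(order)):
        mapping = dict(zip(order, choice))
        span = makespan(graph, order, mapping, exec_time, transfer_cost)
        if best is None or span < best[0]:
            best = (span, mapping)
    return best


# Tune every task on every device, then map using the tuned times.
exec_time = {t: {d: tune(t, d)[0] for d in DEVICES} for t in ORDER}
span, mapping = best_mapping(GRAPH, ORDER, exec_time, transfer_cost=0.5)
print(span, mapping)
```

Exhaustive search is used here only because the example has three tasks and two devices; it grows exponentially with the number of tasks, so a realistic flow would need a heuristic or a solver.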
January 2, 2015 by hgpu