An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning

Klaus Kofler, Ivan Grasso, Biagio Cosenza, Thomas Fahringer
Institute of Computer Science, University of Innsbruck, Austria
27th ACM international conference on Supercomputing, 2013

   author={Kofler, Klaus and Grasso, Ivan and Cosenza, Biagio and Fahringer, Thomas},
   title={An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning},
   booktitle={Proceedings of the 27th ACM international conference on Supercomputing},
   series={ICS ’13},
   location={Eugene, Oregon, USA},
   address={New York, NY, USA},
   keywords={heterogeneous computing, compilers, GPU, task partitioning, code analysis, machine learning, runtime system}


Unleashing the full potential of heterogeneous systems consisting of multi-core CPUs and GPUs is challenging because the computational resources differ in processing capabilities, memory availability, and communication latencies. In this paper we propose a novel approach that automatically optimizes task partitioning for different (input) problem sizes and different heterogeneous architectures. We use the Insieme source-to-source compiler to translate a single-device OpenCL program into a multi-device OpenCL program. The Insieme Runtime System then performs dynamic task partitioning based on an offline-generated prediction model. To derive the prediction model, we use a machine learning approach based on Artificial Neural Networks (ANN) that incorporates static program features as well as dynamic, input-sensitive features. Principal component analysis is used to further improve the task partitioning. Evaluated on a suite of 23 programs, our approach achieves performance improvements of 22% and 25% over executing the benchmarks on a single CPU and a single GPU, respectively, which corresponds to 87.5% of the optimal performance.
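
To make the idea of task partitioning concrete, the following is a minimal sketch of how a runtime might divide the work items of one kernel launch among several devices once a model has predicted a share for each device. It is not the Insieme implementation: the function name `split_work`, the `shares` weights, and the `group_size` rounding are assumptions introduced purely for illustration.

```python
def split_work(total_items, shares, group_size=1):
    """Split `total_items` work items into contiguous (offset, count) chunks.

    `shares` holds one non-negative weight per device (hypothetically, the
    output of a prediction model). Interior chunk boundaries are rounded down
    to a multiple of `group_size`; the last device absorbs the remainder.
    """
    if total_items < 0 or group_size < 1:
        raise ValueError("total_items must be >= 0 and group_size >= 1")
    if not shares or any(s < 0 for s in shares) or sum(shares) <= 0:
        raise ValueError("shares must be non-negative with a positive sum")

    total_share = sum(shares)
    chunks = []
    offset = 0
    cumulative = 0
    for index, share in enumerate(shares):
        cumulative += share
        if index == len(shares) - 1:
            # The last device takes everything that is left.
            end = total_items
        else:
            # Ideal boundary, rounded down to a work-group multiple.
            ideal = int(total_items * cumulative / total_share)
            end = max(offset, ideal // group_size * group_size)
        chunks.append((offset, end - offset))
        offset = end
    return chunks


if __name__ == "__main__":
    # Hypothetical split: the CPU gets 1 part, the GPU gets 3 parts.
    for device, (start, count) in zip(["CPU", "GPU"], split_work(1024, [1, 3], 64)):
        print(f"{device}: offset={start}, count={count}")
```

Each resulting (offset, count) pair could then be passed to a per-device kernel launch. A device whose share is zero simply receives an empty chunk, which corresponds to running the program on the remaining devices only.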

