Integrated Framework for Heterogeneous Embedded Platforms Using OpenCL

Kulin V. Seth
Northeastern University, 2011








The technology community is rapidly moving away from the age of computers and laptops and into the era of hand-held devices. With the rapid development of smartphones and tablets, there has been widespread adoption of Graphics Processing Units (GPUs) in the embedded space. The hand-held market is seeing an ever-increasing rate of development of computationally intensive applications, which require significant processing resources. To meet this challenge, GPUs can be used for general-purpose processing. We are moving towards a future where devices will be more connected and integrated. This will allow applications to run on hand-held devices while offloading computationally intensive tasks to other available compute units. There is a growing need for a general programming framework that can utilize heterogeneous processing units, such as GPUs and DSPs, on embedded platforms. OpenCL, a widely used programming framework, has been a step towards integrating these different processing units on desktop platforms. Extending the use of OpenCL to the embedded space can potentially lead to the development of a new class of applications in the embedded domain. This thesis describes our efforts in this direction. The main idea behind this thesis is to use OpenCL to benefit embedded applications by running them on GPUs. This work provides an integrated toolchain, together with a full-system simulation environment, that supports a platform combining an ARM device and an embedded GPU. The use of an integrated framework provides the visibility and extensibility needed to perform end-to-end studies of these platforms. This thesis pursues optimizations at different levels, namely the source, compiler, and micro-architectural levels. The case studies presented consider the interaction between these levels and provide guidelines derived from this study.
The final goal of the thesis is to study the performance improvements obtained by running computationally intensive tasks on an embedded GPU. Over 20 benchmarks, taken from OpenCL SDKs, are studied in this work. They are broadly categorized as signal processing kernels, image processing applications, and general computational tasks. The simulated results were compared across different GPU configurations, with the CPU as the reference, and an average speedup of around 348 times was seen for kernel execution time. This thesis work can be used as a research tool to study GPGPU computing on embedded devices.
