Time Predictability of GPU Kernel on an HSA Compliant Platform
Malardalen University, School of Innovation Design and Engineering, Vasteras, Sweden
Malardalen University, 2016
@article{larsson2016time,
title={TIME PREDICTABILITY OF GPU KERNEL ON AN HSA COMPLIANT PLATFORM},
author={Larsson, Marcus and Tsog, Nandinbaatar},
year={2016}
}
During recent years, the importance of utilizing more computational power in smaller computer systems has increased. To fit more computational power into smaller packages, combining more than one type of processor unit has become increasingly popular in industry. Combining processor types yields better power efficiency as well as more computational power in a smaller area. However, heterogeneous programming has proved to be difficult, which discourages software developers from learning heterogeneous programming languages. This has motivated the HSA Foundation to develop a new hardware architecture, called Heterogeneous System Architecture (HSA). This architecture brings features that make heterogeneous programming more accessible, efficient, and easier for software developers. The purpose of this thesis is to investigate this new architecture and to observe the timing characteristics of a task running a parallel region (a kernel) on a GPU in an HSA compliant system. To gain more knowledge, four test cases have been developed to collect timing data and to analyze the execution time of the code run on the GPU. These are: comparison between CPU and GPU, timing predictability of parallel periodic tasks, schedulability in HSA, and memory copy. Based on the results of the analysis, it has been concluded that HSA has the potential to be very attractive for developing heterogeneous programs due to its more streamlined infrastructure. It is easier to adopt, requires less knowledge of the underlying hardware, and lets software developers use their preferred programming languages instead of learning a new programming framework, such as OpenCL. However, since the architecture is new, there are bugs, and some HSA features are yet to be incorporated into the drivers.
Performance-wise, HSA is faster than legacy methods, but it falls short of providing consistent time predictability, which is important for real-time systems.
July 5, 2016 by hgpu