An OpenCL Runtime and Scheduler for Embedded Multicore DSP Parallel Systems
Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
Journal of Computational Information Systems 10: 10, 4059-4070, 2014
@article{tian2014opencl,
title={An OpenCL Runtime and Scheduler for Embedded Multicore DSP Parallel Systems},
author={Tian, Li and Zhou, Fugen and Meng, Cai},
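journal={Journal of Computational Information Systems},
volume={10},
number={10},
pages={4059--4070},
year={2014}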
}
We address the lack of OpenCL programming support on multicore DSP systems. We design a compiler and propose a runtime framework for TI multicore DSPs that lets OpenCL parallel programs take advantage of multicore computing resources. First, we use the LLVM and Clang compiler front end to perform source-to-source translation, and in the next stage build the translated kernel into a DSP dynamic module. Second, we propose a new RTOS scheduler for kernel tasks that reduces context switches and enables switching between multiple work-item tasks. Finally, we develop a software-managed cache strategy for accessing distributed global memory in multi-DSP systems with SRIO interconnections. The runtime effectively exposes the full computing resources of the distributed multicore DSPs to the user for kernel execution. We evaluate performance using common OpenCL kernels from the NVIDIA, NAS, AMD, and Parboil benchmarks. Experimental results show that OpenCL applications perform well on multicore DSP systems.
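To make the idea of a software-managed cache for remote global memory more concrete, here is a minimal sketch in C. It is not the paper's implementation: the abstract does not state the cache organization or policy, so the direct-mapped, read-only design, the line size, the line count, and every name below (`remote_mem`, `remote_read`, `cached_read8`, and so on) are assumptions made for illustration. The `remote_read` function stands in for an SRIO transfer and simply copies from a local array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES   64u    /* hypothetical cache-line size */
#define NUM_LINES    8u     /* hypothetical number of lines kept in local memory */
#define REMOTE_BYTES 4096u  /* size of the simulated remote global memory */

/* Stand-in for global memory that resides on another DSP. */
static uint8_t remote_mem[REMOTE_BYTES];

/* Counts simulated remote transfers so that hits and misses can be observed. */
static unsigned remote_fetches = 0;

/* Fill the simulated remote memory with a known pattern (byte i holds i mod 256). */
static void remote_init(void) {
    for (uint32_t i = 0; i < REMOTE_BYTES; i++) {
        remote_mem[i] = (uint8_t)(i & 0xFFu);
    }
}

/* Hypothetical transfer routine; a real system would issue an SRIO read here. */
static void remote_read(uint32_t addr, void *dst, uint32_t len) {
    assert(addr + len <= REMOTE_BYTES);
    memcpy(dst, &remote_mem[addr], len);
    remote_fetches++;
}

typedef struct {
    int      valid;             /* 0 until the line has been filled */
    uint32_t tag;               /* global line number currently cached */
    uint8_t  data[LINE_BYTES];  /* local copy of the remote line */
} cache_line_t;

/* Static storage is zero-initialized, so every line starts out invalid. */
static cache_line_t cache[NUM_LINES];

/* Read one byte of global memory through the direct-mapped software cache. */
static uint8_t cached_read8(uint32_t addr) {
    uint32_t line_no = addr / LINE_BYTES;   /* which remote line holds addr */
    uint32_t index   = line_no % NUM_LINES; /* slot that line maps to */
    uint32_t offset  = addr % LINE_BYTES;   /* byte position inside the line */
    cache_line_t *line = &cache[index];

    if (!line->valid || line->tag != line_no) {
        /* Miss: fetch the whole line from remote memory, replacing the old one. */
        remote_read(line_no * LINE_BYTES, line->data, LINE_BYTES);
        line->tag = line_no;
        line->valid = 1;
    }
    return line->data[offset];
}
```

With these assumed parameters, reading addresses 0 through 63 after `remote_init()` costs one simulated transfer, and address 64 triggers a second. Address 512 maps to the same slot as address 0 and evicts it. A practical design would also need to handle writes and coherence between DSPs, which this sketch leaves out.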
May 18, 2014 by hgpu