Workload distribution and balancing in FPGAs and CPUs with OpenCL and TBB
Universidad de Malaga, Andalucia Tech, Spain
University of Bristol, 2015
@article{asenjo2016workload,
title={Workload distribution and balancing in FPGAs and CPUs with OpenCL and TBB},
author={Asenjo, Rafael and Navarro, Angeles and Rodriguez, Andres and Nunez-Yanez, Jose},
year={2016}
}
In this paper we evaluate the performance and energy effectiveness of FPGA and CPU devices for a class of parallel computing applications in which the workload can be distributed in a way that enables simultaneous computing in addition to simple offloading. The FPGA device is programmed via OpenCL using recently available commercial tools and hardware, while Threading Building Blocks (TBB) is used to orchestrate the load distribution and balancing between the FPGA and the multicore CPU. We focus on streaming applications that can be implemented as a pipeline of stages. We present an approach that allows the user to specify the mapping of the pipeline stages to the devices (FPGA, GPU or CPU) and the number of active threads. Using a real streaming application as a case study, we evaluate how these parameters affect performance and energy efficiency on a reference heterogeneous system that includes four different types of computational resources: a quad-core Intel Haswell CPU, an embedded Intel HD6000 GPU, a discrete NVIDIA GPU and an Altera FPGA.
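To make the two user-facing parameters in the abstract concrete (the stage-to-device mapping and the number of active threads), the following is a minimal sketch in standard C++. It is not the authors' code and uses neither TBB nor OpenCL: the names `Device`, `PipelineConfig`, `run_stage` and `run_pipeline` are hypothetical, every stage actually runs on the CPU, and the stage body is a placeholder that only adds 1 to its input.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical device labels; in this sketch everything runs on the CPU.
enum class Device { CPU, GPU, FPGA };

// Hypothetical user-facing configuration: one device per pipeline stage,
// plus the number of worker threads that keep items in flight.
struct PipelineConfig {
    std::vector<Device> stage_device;
    unsigned num_threads;
};

// Placeholder stage body. A real system would dispatch GPU/FPGA stages to
// an OpenCL kernel; here each stage adds 1 regardless of the device label.
inline int run_stage(Device /*dev*/, int value) {
    return value + 1;
}

// Push every input item through all stages. Worker threads claim items
// dynamically from a shared counter, so faster workers process more items.
inline std::vector<int> run_pipeline(const PipelineConfig& cfg,
                                     const std::vector<int>& input) {
    assert(cfg.num_threads > 0);
    std::vector<int> output(input.size());
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= input.size()) break;  // no items left to claim
            int v = input[i];
            for (Device d : cfg.stage_device) v = run_stage(d, v);
            output[i] = v;  // each index is written by exactly one thread
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < cfg.num_threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return output;
}
```

A caller would fill `stage_device` with one entry per stage, for example `{Device::CPU, Device::FPGA, Device::CPU}`, choose `num_threads`, and pass the input items to `run_pipeline`. The shared counter is one simple way to balance load dynamically; the scheduling used in the paper may differ.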
February 8, 2016 by hgpu