Characterization of OpenCL on a Scalable FPGA Architecture
Pico Computing, Inc.
Pico Computing, Inc., 2014
@article{gao2014characterization,
title={Characterization of OpenCL on a Scalable FPGA Architecture},
author={Gao, Shanyuan and Chritz, Jeremy},
year={2014}
}
The recent release of Altera’s SDK for OpenCL has greatly eased the development of FPGA-based systems. Research has shown performance improvements from OpenCL on a single FPGA device. However, to meet the objectives of high performance computing, OpenCL needs to be evaluated on multiple FPGAs. This work proposes a scalable FPGA architecture for high performance computing. The design comprises multiple FPGA modules and a high performance backplane. The modular nature of this architecture supports combining different FPGAs and allows for easy hardware updates. FPGA modules based on Stratix V are compatible with Altera’s OpenCL tool flow. The evaluation tests the native I/O performance of the architecture, and the results demonstrate scalability using six FPGAs. The measured host-to-device peak bandwidth is 13.1 GB/s for read operations and 12.1 GB/s for write operations. The measured FPGA-to-memory bandwidth is 64.5 GB/s in total. An OpenCL AES kernel is selected to test the scalable multi-FPGA architecture. The test results show that peak throughput is achieved when six FPGAs are used. Throughput per watt shows a 5x improvement using four FPGAs over a general-purpose processor.
December 30, 2014 by hgpu