OpenCL-ready High Speed FPGA Network for Reconfigurable High Performance Computing

Ryohei Kobayashi, Yuma Oobata, Norihisa Fujita, Yoshiki Yamaguchi, Taisuke Boku
Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577 Japan

@inproceedings{kobayashi2018opencl,
   title={OpenCL-ready High Speed FPGA Network for Reconfigurable High Performance Computing},
   author={Kobayashi, Ryohei and Oobata, Yuma and Fujita, Norihisa and Yamaguchi, Yoshiki and Boku, Taisuke},
   booktitle={Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region},
   pages={192--201},
   year={2018},
   organization={ACM}
}

Field-programmable gate arrays (FPGAs) have gained attention in high-performance computing (HPC) research because their computation and communication capabilities have improved dramatically in recent years, driven by advances in semiconductor integration technology that follow Moore's Law. In addition, FPGA vendors now offer OpenCL-based development toolchains, which reduce the programming effort required compared with the past. These improvements make it possible to realize a concept in which computations that CPUs and GPUs perform poorly are offloaded on the fly to FPGAs, with low-latency data movement. We think that this concept is one of the keys to further improving the performance of modern heterogeneous supercomputers that use accelerators such as GPUs. In this paper, we propose high-performance inter-FPGA Ethernet communication using mixed OpenCL and Verilog HDL programming to demonstrate the feasibility of this concept. OpenCL is used to program application algorithms and data movement control, while Verilog HDL is used to implement low-level components for Ethernet communication. Experimental results using ping-pong programs showed that our proposed approach achieves a latency of 0.99 µs and a throughput of up to 4.97 GB/s between FPGAs on different nodes, confirming that the proposed method is effective for realizing this concept.
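
The abstract does not say how the ping-pong figures were derived, so the following is only a minimal sketch of the conventional calculation, not the authors' measurement code. It assumes that one-way latency is taken as half of the average round-trip time and that throughput is payload bytes divided by one-way time, with 1 GB = 10^9 bytes. The function names and the sample numbers are hypothetical.

```python
def one_way_latency(round_trip_seconds: float) -> float:
    """Estimate one-way latency as half of a round-trip time (assumed convention)."""
    return round_trip_seconds / 2.0


def throughput_gb_per_s(message_bytes: int, one_way_seconds: float) -> float:
    """Payload bytes moved per second in one direction, in GB/s (1 GB = 1e9 bytes)."""
    return message_bytes / one_way_seconds / 1e9


def summarize(message_bytes: int, iterations: int, total_seconds: float):
    """Average `iterations` ping-pong round trips of a single message size.

    Returns (one-way latency in seconds, throughput in GB/s).
    """
    if iterations <= 0 or total_seconds <= 0:
        raise ValueError("iterations and total_seconds must be positive")
    round_trip = total_seconds / iterations
    latency = one_way_latency(round_trip)
    return latency, throughput_gb_per_s(message_bytes, latency)


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the paper:
    # 100 round trips of a 1 MB message measured at 0.04 s in total.
    latency, throughput = summarize(1_000_000, 100, 0.04)
    print(f"latency: {latency * 1e6:.1f} us, throughput: {throughput:.2f} GB/s")
```

In practice, latency is usually read from the smallest message size and throughput from the largest, since small messages are dominated by fixed overhead and large ones by link bandwidth.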