A Case for Work-stealing on FPGAs with OpenCL Atomics

Nadesh Ramanathan, John Wickerson, Felix Winterstein, George A. Constantinides
Imperial College London, UK
FPGA, 2016








We provide a case study of work-stealing, a popular method for run-time load balancing, on FPGAs. Following the Cederman-Tsigas implementation for GPUs, we synchronize work-items not with locks, mutexes or critical sections, but instead with the atomic operations provided by Altera’s OpenCL SDK. We evaluate work-stealing for FPGAs by synthesizing a K-means clustering algorithm on an Altera P385 D5 board, both with work-stealing and with a statically partitioned load. When block RAM utilization is maximized in both cases, we find that work-stealing leads to a 1.5x speedup. This demonstrates that the ability to do load balancing at run-time can outweigh the drawback of using "expensive" atomics on FPGAs. We hope that our case study will stimulate further research into the high-level synthesis of fine-grained, lock-free, concurrent programs.
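To make the idea of lock-free work-stealing concrete, the following is a minimal sketch of a bounded work-stealing deque in the style of the Arora-Blumofe-Plaxton design, written in host-side C11 with `<stdatomic.h>`. It is not the authors' OpenCL kernel: the names `ws_deque`, `ws_push`, `ws_pop`, `ws_steal` and the capacity `DEQUE_CAPACITY` are hypothetical, tasks are plain integers, and the packing of a tag and a top index into one 32-bit word is an assumption chosen so that a single compare-and-swap can update both.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEQUE_CAPACITY 64 /* hypothetical fixed size, chosen for illustration */

typedef struct {
    _Atomic uint32_t age;                 /* high 16 bits: tag, low 16 bits: top index */
    _Atomic uint32_t bot;                 /* next free slot at the owner's end */
    _Atomic int tasks[DEQUE_CAPACITY];    /* task identifiers */
} ws_deque;

static uint32_t pack(uint32_t tag, uint32_t top) { return (tag << 16) | (top & 0xFFFFu); }
static uint32_t top_of(uint32_t age) { return age & 0xFFFFu; }
static uint32_t tag_of(uint32_t age) { return age >> 16; }

void ws_init(ws_deque *d) {
    atomic_init(&d->age, 0);
    atomic_init(&d->bot, 0);
    for (int i = 0; i < DEQUE_CAPACITY; i++) atomic_init(&d->tasks[i], 0);
}

/* Owner only: add a task at the bottom. Returns false if the deque is full. */
bool ws_push(ws_deque *d, int task) {
    uint32_t b = atomic_load(&d->bot);
    if (b >= DEQUE_CAPACITY) return false;
    atomic_store(&d->tasks[b], task);
    atomic_store(&d->bot, b + 1);
    return true;
}

/* Owner only: take the most recently pushed task from the bottom. */
bool ws_pop(ws_deque *d, int *out) {
    uint32_t b = atomic_load(&d->bot);
    if (b == 0) return false;
    b--;
    atomic_store(&d->bot, b);
    int task = atomic_load(&d->tasks[b]);
    uint32_t old_age = atomic_load(&d->age);
    if (b > top_of(old_age)) { /* more than one task was left: no conflict possible */
        *out = task;
        return true;
    }
    /* At most one task was left: reset the deque and race with thieves for it. */
    atomic_store(&d->bot, 0);
    uint32_t new_age = pack(tag_of(old_age) + 1, 0); /* new tag guards against ABA */
    if (b == top_of(old_age) &&
        atomic_compare_exchange_strong(&d->age, &old_age, new_age)) {
        *out = task;
        return true;
    }
    atomic_store(&d->age, new_age); /* a thief won the last task */
    return false;
}

/* Any thread: try to take the oldest task from the top. */
bool ws_steal(ws_deque *d, int *out) {
    uint32_t old_age = atomic_load(&d->age);
    uint32_t b = atomic_load(&d->bot);
    if (b <= top_of(old_age)) return false; /* nothing to steal */
    int task = atomic_load(&d->tasks[top_of(old_age)]);
    uint32_t new_age = pack(tag_of(old_age), top_of(old_age) + 1);
    if (atomic_compare_exchange_strong(&d->age, &old_age, new_age)) {
        *out = task;
        return true;
    }
    return false; /* lost the race; the caller may retry or pick another victim */
}
```

In this sketch each worker would own one deque, pushing and popping at the bottom, and would call `ws_steal` on another worker's deque only when its own is empty. The only contended operation is the compare-and-swap on `age`, which is the kind of atomic the abstract describes as "expensive" on FPGAs.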