A Case for Work-stealing on FPGAs with OpenCL Atomics

Nadesh Ramanathan, John Wickerson, Felix Winterstein, George A. Constantinides
Imperial College London, UK
FPGA, 2016

We provide a case study of work-stealing, a popular method for run-time load balancing, on FPGAs. Following the Cederman-Tsigas implementation for GPUs, we synchronize work-items not with locks, mutexes or critical sections, but instead with the atomic operations provided by Altera’s OpenCL SDK. We evaluate work-stealing for FPGAs by synthesizing a K-means clustering algorithm on an Altera P385 D5 board, both with work-stealing and with a statically partitioned load. When block RAM utilization is maximized in both cases, we find that work-stealing leads to a 1.5x speedup. This demonstrates that the ability to do load balancing at run-time can outweigh the drawback of using "expensive" atomics on FPGAs. We hope that our case study will stimulate further research into the high-level synthesis of fine-grained, lock-free, concurrent programs.
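
The load-balancing effect described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' OpenCL implementation: it does not model atomics, FPGA resources, or the cost of a steal, and the function names (`partition`, `static_makespan`, `stealing_makespan`) and the "steal from the fullest queue" victim policy are hypothetical choices made for illustration. Each worker pops tasks from the bottom of its own queue and, when that queue is empty, takes a task from the top of another worker's queue.

```python
import heapq
from collections import deque


def partition(costs, n_workers):
    """Split the task list into n_workers contiguous blocks (static partition)."""
    size = -(-len(costs) // n_workers)  # ceiling division
    return [deque(costs[i * size:(i + 1) * size]) for i in range(n_workers)]


def static_makespan(costs, n_workers):
    """Finish time when each worker only processes its own block."""
    return max(sum(q) for q in partition(costs, n_workers))


def stealing_makespan(costs, n_workers):
    """Finish time when idle workers steal tasks (steal cost assumed to be zero)."""
    queues = partition(costs, n_workers)
    # Min-heap of (time at which the worker becomes idle, worker id).
    idle = [(0, w) for w in range(n_workers)]
    heapq.heapify(idle)
    makespan = 0
    while idle:
        now, w = heapq.heappop(idle)
        if queues[w]:
            cost = queues[w].pop()  # own work: take from the bottom
        else:
            # Hypothetical victim policy: pick the worker with the most tasks left.
            victim = max(range(n_workers), key=lambda v: len(queues[v]))
            if not queues[victim]:
                makespan = max(makespan, now)  # no work anywhere: worker retires
                continue
            cost = queues[victim].popleft()  # stolen work: take from the top
        heapq.heappush(idle, (now + cost, w))
    return makespan


if __name__ == "__main__":
    costs = [8, 8, 8, 8, 1, 1, 1, 1]  # deliberately unbalanced workload
    print(static_makespan(costs, 2), stealing_makespan(costs, 2))
```

For the unbalanced workload above with two workers, the static partition finishes at time 32 and the stealing schedule at time 20. On a uniform workload the two schedules coincide, so in this idealized model the benefit comes entirely from irregular task costs; a real implementation would also pay for the atomic operations that the abstract calls "expensive".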