19003

Sorting on FPGAs using Merge Trees

Nikola Samardzic
UCLA
UCLA, 2019

@article{samardzic2019sorting,

   title={Sorting on FPGAs using Merge Trees},

   author={Samardzic, Nikola},

   year={2019}

}

Download Download (PDF)   View View   Source Source   

1653

views

Hardware Mergers can be used to implement sorting algorithms on Field-Programmable Gate Arrays (FPGAs) by inductively merging elements as in the Merge Sort algorithm.[1][2] These Hardware Mergers have also been laid out onto the FPGA in a complete binary tree pattern (called a Hardware Merge Tree) which further enhances performance of the sorting procedure by merging many input arrays at once.[3] In [4] the authors recognize the need for using different Merger Trees when sorting on different levels of the memory hierarchy (On-Chip, DRAM, or Flash). Nonetheless, their choice of Merge Trees at each level of the hierarchy is rather ad hoc and there is no rigorous analysis to justify their design choices. In this work, we develop a comprehensive analytical model that takes into consideration the off-chip memory bandwidth and the amount of on-chip resources to optimize the overall sorting latency. Further, we develop methods for implementing a wider range of Merge Trees as well as a general DRAM access pattern and data loading model that operates at peak DRAM bandwidth without affecting the throughput of the Merge Tree. As all signals in our Merge Tree are local to the tree nodes (Hardware Mergers), the design can be scaled to future improvements in FPGA resource availability as well as to a network of many interconnected FPGAs, without impact to the Merge Tree throughput. We present the only end-to-end sorting design on FPGAs aside from the design in [4]. Cycle-accurate simulations of our design report that it outperforms in total sorting time the implementation in [4] on the same hardware setup by 3.1x. Our cycle-accurate software simulations which account for DRAM random access and sequential access trade-offs demonstrates that our end-to-end design has total sorting time within 90% of the predictions of our model.
Rating: 1.0/5. From 1 vote.
Please wait...

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: