Hardware Checkpointing and Productive Debugging Flows for FPGAs

Sameh Attia
Electrical and Computer Engineering, University of Toronto
University of Toronto, 2022





As FPGAs become larger and more complex, productive debugging is becoming more challenging. In this thesis, we detail a new debugging flow based on hardware checkpointing that provides full visibility and controllability while maintaining reasonable execution speed. Hardware checkpointing is not only useful for debugging but also enables several other capabilities such as live migration, fault recovery, and context switching; however, it has been difficult to achieve for FPGA applications. We overcome the challenges of checkpointing FPGA designs and realize the proposed checkpoint-based debugging flow. First, we propose techniques and wrappers that can safely interrupt a running design to create a consistent (restartable) checkpoint while avoiding hazards such as data loss or deadlock. We also develop approaches and tools that can access buried on-chip state that cannot be captured directly, so that checkpoints are complete. We next propose a checkpoint-based debugging framework, StateMover, that can seamlessly move the design state back and forth between an FPGA and a simulator, achieving the best of both worlds: speed and observability. Finally, we build a transaction-based co-simulation framework, StateLink, to extend the proposed debugging flow to systems that cannot be moved entirely to a simulator, such as CPU+FPGA accelerators or datacenter-scale applications. The combination of these tools enables new and productive debugging flows. StateMover allows designs to run at full hardware speed until a region of interest is approached; a checkpoint can then be loaded into a simulator, where its execution is observed and controlled. StateLink allows a designer-selected portion of the system to be moved into a simulator, enabling simulation speedups of up to 25x versus simulating the entire design.
We demonstrate on several designs that StateMover and StateLink support can be added to a design with low resource and timing overhead, and illustrate the utility of the flow by debugging a complete Memcached system.
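
The run-fast-then-inspect idea behind the checkpoint-based flow can be illustrated with a toy model. The sketch below is not StateMover or its API: `ToyDesign`, its `checkpoint`/`restore` methods, and the LFSR "design" are all hypothetical stand-ins chosen for illustration. It only assumes that a design's behavior is fully determined by its captured state, so a copy restored from a consistent checkpoint continues exactly as the original would.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDesign:
    """Hypothetical stand-in for a design: a 16-bit LFSR plus a cycle counter."""
    lfsr: int = 0xACE1
    cycle: int = 0
    tracing: bool = False  # False models opaque hardware, True models a simulator
    trace: list = field(default_factory=list)

    def step(self):
        # One clock cycle of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11).
        bit = (self.lfsr ^ (self.lfsr >> 2) ^ (self.lfsr >> 3) ^ (self.lfsr >> 5)) & 1
        self.lfsr = (self.lfsr >> 1) | (bit << 15)
        self.cycle += 1
        if self.tracing:
            # Only the "simulator" records per-cycle state.
            self.trace.append((self.cycle, self.lfsr))

    def run(self, cycles):
        for _ in range(cycles):
            self.step()

    def checkpoint(self):
        # Capture every piece of state needed to resume execution.
        return {"lfsr": self.lfsr, "cycle": self.cycle}

    @classmethod
    def restore(cls, state, tracing):
        # Build a new instance that resumes from the captured state.
        return cls(lfsr=state["lfsr"], cycle=state["cycle"], tracing=tracing)


# Run "on hardware" (no visibility) up to the region of interest.
hardware = ToyDesign()
hardware.run(1000)

# Move the state to a "simulator" and observe the next few cycles.
sim = ToyDesign.restore(hardware.checkpoint(), tracing=True)
sim.run(5)

# The original keeps running; both copies must agree if the checkpoint was complete.
hardware.run(5)
assert sim.lfsr == hardware.lfsr
```

In this toy, completeness of the checkpoint is trivial because the state is two integers. The thesis addresses the much harder case where a running FPGA design must be interrupted safely and some on-chip state cannot be captured directly.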