Automatic Generation of Application-Specific Accelerators for FPGAs from Python Loop Nests
Electrical Engineering and Computer Sciences, University of California at Berkeley
University of California at Berkeley, Technical Report No. UCB/EECS-2012-203, 2012
@techreport{Sheffield:EECS-2012-203,
Author={Sheffield, David and Anderson, Michael and Keutzer, Kurt},
Title={Automatic Generation of Application-Specific Accelerators for FPGAs from Python Loop Nests},
Institution={EECS Department, University of California, Berkeley},
year={2012},
Month=oct,
URL={http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-203.html},
Number={UCB/EECS-2012-203}
}
We present Three Fingered Jack, a highly productive approach to mapping vectorizable applications to the FPGA. Our system applies traditional dependence analysis and reordering transformations to a restricted set of Python loop nests in order to uncover parallelism and divide computation among multiple parallel processing elements (PEs). The PEs are generated automatically through high-level synthesis of the optimized loop body. Design space exploration on the FPGA proceeds by varying the number of PEs in the system. Over four benchmark kernels, our system achieves a 3x to 6x speedup relative to soft-core C performance.
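To make the idea of dividing a vectorizable loop among PEs concrete, the following is a minimal Python sketch. It is not taken from the report: the kernel (`saxpy`), the helper names (`partition`, `saxpy_on_pes`), and the block-partitioning scheme are all assumptions chosen for illustration. The sketch only shows why a loop with no loop-carried dependence can be split into independent blocks, one per PE.

```python
# Illustrative sketch only. All names here are hypothetical and are not
# part of the Three Fingered Jack system described in the report.

def saxpy(a, x, y):
    """Reference loop: iteration i reads x[i] and y[i] and writes out[i],
    so no iteration depends on the result of another."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out


def partition(n, num_pes):
    """Split the iteration range [0, n) into contiguous blocks, one per PE.
    The first (n % num_pes) blocks receive one extra iteration."""
    base, extra = divmod(n, num_pes)
    blocks = []
    start = 0
    for pe in range(num_pes):
        size = base + (1 if pe < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def saxpy_on_pes(a, x, y, num_pes):
    """Apply the same loop body to each PE's block. Because iterations are
    independent, the blocks could run in any order or concurrently; here
    they run one after another to keep the sketch simple."""
    out = [0.0] * len(x)
    for lo, hi in partition(len(x), num_pes):
        for i in range(lo, hi):
            out[i] = a * x[i] + y[i]
    return out
```

Varying `num_pes` in this sketch loosely mirrors the design space exploration mentioned in the abstract, although on real hardware the trade-off involves FPGA resources and memory bandwidth, which this sketch does not model.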
October 25, 2012 by hgpu