high performance computing on graphics processing units: hgpu.org

GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of FPGA Datapaths

Nachiket Kapre, Deheng Ye

School of Computer Engineering, Nanyang Technological University, Singapore 639798

International Symposium on Field-Programmable Gate Arrays, 2016

Source codes

Package: BitGPU: a GPU approach to solve the bitwidth optimization problem in FPGA datapaths

Bitwidth optimization of FPGA datapaths can save hardware resources by choosing the fewest bits required for each datapath variable to achieve a desired quality of result. However, it is an NP-hard problem that requires unacceptably long runtimes when using sequential CPU-based heuristics. We show how to parallelize the key steps of bitwidth optimization on the GPU by performing a fast brute-force search over a carefully constrained search space. We develop a high-level synthesis methodology suitable for rapid prototyping of bitwidth-annotated RTL code generation using gcc’s GIMPLE backend. For range analysis, we perform parallel evaluation of sub-intervals to provide tighter bounds compared to ordinary interval arithmetic. For bitwidth allocation, we enumerate the different bitwidth combinations in parallel by assigning each combination to a GPU thread. We demonstrate up to 10-1000x speedups for range analysis and 50-200x speedups for bitwidth allocation when comparing an NVIDIA K20 GPU implementation to an Intel Core i5-4570 CPU, while maintaining identical solution quality across various benchmarks. This allows us to generate tailor-made RTL with minimum bitwidths in hundreds of milliseconds instead of hundreds of minutes when starting from high-level C descriptions of dataflow computations.
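The two steps named in the abstract can be illustrated with a small sequential Python sketch. This is not the BitGPU code: the datapath `y = x*x - x`, the error model (sum of `2**-bits`), the cost model (total bit count), and all function names are assumptions chosen for illustration. Each loop iteration is independent of the others, which is the property that would let a GPU version map one sub-interval or one bitwidth combination to one thread.

```python
from itertools import product

def interval_mul(a, b):
    # Interval product: the bounds are the min and max of the four corner products.
    corners = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(corners), max(corners))

def interval_sub(a, b):
    # Interval difference: [a_lo - b_hi, a_hi - b_lo].
    return (a[0] - b[1], a[1] - b[0])

def f_interval(x):
    # Hypothetical datapath y = x*x - x, evaluated with interval arithmetic.
    return interval_sub(interval_mul(x, x), x)

def range_with_subintervals(lo, hi, pieces):
    # Split [lo, hi] into equal sub-intervals, evaluate each one independently,
    # and take the union (hull) of the per-piece bounds.
    width = (hi - lo) / pieces
    bounds = [f_interval((lo + i * width, lo + (i + 1) * width))
              for i in range(pieces)]
    return (min(b[0] for b in bounds), max(b[1] for b in bounds))

def allocate_bitwidths(num_vars, min_bits, max_bits, error_budget):
    # Brute-force search: try every bitwidth combination and keep the cheapest
    # one whose modeled error fits the budget. Returns (cost, combo) or None.
    best = None
    for combo in product(range(min_bits, max_bits + 1), repeat=num_vars):
        error = sum(2.0 ** -b for b in combo)  # assumed error model
        cost = sum(combo)                      # assumed cost model
        if error <= error_budget and (best is None or cost < best[0]):
            best = (cost, combo)
    return best

if __name__ == "__main__":
    print(range_with_subintervals(0.0, 1.0, 1))  # ordinary interval arithmetic
    print(range_with_subintervals(0.0, 1.0, 4))  # four sub-intervals
    print(allocate_bitwidths(2, 1, 4, 0.25))
```

For `x` in `[0, 1]`, a single interval gives the bound `(-1.0, 1.0)`, while four sub-intervals tighten it to `(-0.5, 0.25)`; the true range of `x*x - x` on that domain is `[-0.25, 0]`, so more pieces bring the bound closer at the price of more independent evaluations.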

Tags: Benchmarking, Code generation, Computer science, CUDA, FPGA, nVidia, Package, Performance, Tesla K20

February 10, 2016 by hgpu
