Accelerating Double Precision Floating-point Hessenberg Reduction on FPGA and Multicore Architectures
CSCE Department, University of Arkansas
Symposium on Application Accelerators in High Performance Computing, 2010
@inproceedings{huangaccelerating,
title={Accelerating Double Precision Floating-point Hessenberg Reduction on FPGA and Multicore Architectures},
author={Huang, M. and Wang, L. and El-Ghazawi, T.},
booktitle={Application Accelerators in High Performance Computing, 2010 Symposium, Papers},
year={2010}
}
Double precision floating-point performance is critical if hardware acceleration technologies are to be adopted by domain scientists. In this work we use the Hessenberg reduction to demonstrate the potential of FPGAs and GPUs for obtaining satisfactory double precision floating-point performance. Currently a 2.26 GHz Xeon (Nehalem) CPU outperforms a Xilinx Virtex-4 LX200 by a factor of 3.6. However, given higher clock frequencies, more hardware resources and more local memory banks, FPGAs have the potential to outperform multicore CPUs in the near future. On the GPU side, a GTX 480 (Fermi) achieves a 19.4x speedup over the Xeon CPU. Based on the current trend, GPUs will keep widening their advantage over both FPGAs and CPUs in double precision floating-point performance.
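For readers unfamiliar with the kernel named in the abstract, the following is a minimal sketch of a Hessenberg reduction using Householder reflections, a standard textbook approach. It is not taken from the paper and is not necessarily how the authors map the computation onto FPGA, GPU, or CPU hardware. The function name `hessenberg_reduce` and the sample matrix are illustrative assumptions. Python floats are IEEE 754 doubles, so the arithmetic here is double precision.

```python
import math


def hessenberg_reduce(a):
    """Return an upper Hessenberg matrix orthogonally similar to square matrix a.

    Illustrative sketch only: unblocked Householder reduction on nested lists.
    """
    n = len(a)
    h = [[float(x) for x in row] for row in a]  # work on a double precision copy

    for k in range(n - 2):
        # Entries of column k below the diagonal; all but the first get zeroed.
        x = [h[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already in Hessenberg form

        # Householder vector v = x + sign(x[0]) * ||x|| * e1, then normalized.
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        m = len(v)

        # Apply the reflector from the left: rows k+1..n-1 of h.
        for j in range(n):
            dot = sum(v[i] * h[k + 1 + i][j] for i in range(m))
            for i in range(m):
                h[k + 1 + i][j] -= 2.0 * v[i] * dot

        # Apply the same reflector from the right: columns k+1..n-1 of h.
        for i in range(n):
            dot = sum(h[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                h[i][k + 1 + j] -= 2.0 * dot * v[j]

    return h


if __name__ == "__main__":
    sample = [
        [4.0, 1.0, 2.0, 3.0],
        [1.0, 3.0, 0.0, 1.0],
        [2.0, 0.0, 2.0, 1.0],
        [3.0, 1.0, 1.0, 5.0],
    ]
    for row in hessenberg_reduce(sample):
        print(["%.6f" % value for value in row])
```

Because each step applies the same orthogonal reflector on both sides, the result has the same eigenvalues as the input, and every entry below the first subdiagonal is zero up to rounding error. Production codes typically use a blocked variant of this loop (for example LAPACK's `dgehrd`) to make better use of the memory hierarchy.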
February 18, 2011 by hgpu